Field of the Invention

The present invention relates to toxins isolated from 
bacteria and the use of said toxins as insecticides. 

Background of the Invention 



Many insects are widely regarded as pests to homeowners, to 
picnickers, to gardeners, and to farmers and others whose
investments in agricultural products are often destroyed or 
diminished as a result of insect damage to field crops. 
Particularly in areas where the growing season is short, 
significant insect damage can mean the loss of all profits to 
growers and a dramatic decrease in crop yield. Scarce supply of 
particular agricultural products invariably results in higher 
costs to food processors and, then, to the ultimate consumers of 
food plants and products derived from those plants. 

Preventing insect damage to crops and flowers and
eliminating the nuisance of insect pests have typically relied on
strong organic pesticides and insecticides with broad toxicities.
These synthetic products have come under attack by the general
population as being too harsh on the environment and on those
exposed to such agents. Similarly, in non-agricultural settings,
homeowners would be satisfied to have insects avoid their homes
or outdoor meals without needing to kill the insects.






WO 97/17432 PCT/US96/18003

The extensive use of chemical insecticides has raised 
environmental and health concerns for farmers, companies that 
produce the insecticides, government agencies, public interest 
groups, and the public in general. The development of less 
intrusive pest management strategies has been spurred along both
by societal concern for the environment and by the development of 
biological tools which exploit mechanisms of insect management. 
Biological control agents present a promising alternative to 
chemical insecticides. 

Organisms at every evolutionary development level have
devised means to enhance their own success and survival. The use
of biological molecules as tools of defense and aggression is
known throughout the animal and plant kingdoms. In addition, the
relatively new tools of the genetic engineer allow modifications
to biological insecticides to accomplish particular solutions to
particular problems.

One such agent, Bacillus thuringiensis (Bt), is an effective
insecticidal agent and is widely used commercially as such. In
fact, the insecticidal agent of the Bt bacterium is a protein
which has such limited toxicity that it can be used on human food
crops on the day of harvest. To non-targeted organisms, the Bt
toxin is a digestible non-toxic protein.

Another known class of biological insect control agents is
certain genera of nematodes known to be vectors of transmission
for insect-killing bacterial symbionts. Nematodes containing
insecticidal bacteria invade insect larvae. The bacteria then
kill the larvae. The nematodes reproduce in the larval cadaver.
The nematode progeny then eat the cadaver from within. The
bacteria-containing nematode progeny thus produced can then
invade additional larvae.

In the past, insecticidal nematodes in the Steinernema and
Heterorhabditis genera were used as insect control agents.
Apparently, each genus of nematode hosts a particular species of
bacterium. In nematodes of the Heterorhabditis genus, the
symbiotic bacterium is Photorhabdus luminescens.

Although these nematodes are effective insect control
agents, it is presently difficult, expensive, and inefficient to
produce, maintain, and distribute nematodes for insect control.
It has been known in the art that one may isolate an
insecticidal toxin from Photorhabdus luminescens that has
activity only when injected into Lepidopteran and Coleopteran
insect larvae. This has made it impossible to effectively
exploit the insecticidal properties of the nematode or its
bacterial symbiont. What would be useful is a more
practical, less labor-intensive wide-area delivery method for an
insecticidal toxin which would retain its biological properties
after delivery. It would be highly desirable to discover toxins
with oral activity produced by the genus Photorhabdus. The
isolation and use of these toxins are desirable for reasons of
efficacy. Until applicants' discoveries, these toxins had not
been isolated or characterized.

Summary of the Invention 



The native toxins are protein complexes that are produced
and secreted by growing bacterial cells of the genus Photorhabdus.
Of interest are the proteins produced by the species Photorhabdus
luminescens. The protein complexes, with a molecular size of
approximately 1,000 kDa, can be separated by SDS-PAGE gel
analysis into numerous component proteins. The toxins contain no
hemolysin, lipase, type C phospholipase, or nuclease activities.
The toxins exhibit significant toxicity upon exposure or
administration to a number of insects.

The present invention provides an easily administered
insecticidal protein as well as the expression of toxin in a
heterologous system.

The present invention also provides a method for delivering
insecticidal toxins that are functionally active and effective
against many orders of insects.

Objects, advantages, and features of the present invention
will become apparent from the following specification.

Brief Description of the Drawings 

Fig. 1 is an illustration of a match of cloned DNA isolates
used in sequencing the genes for the toxin of the present
invention.

Fig. 2 is a map of three plasmids used in the sequencing
process.

Fig. 3 is a map illustrating the inter-relationship of
several partial DNA fragments.

Fig. 4 is an illustration of a homology analysis between the
protein sequences of TcbAii and TcaBii proteins.

Fig. 5 is a phenogram of Photorhabdus strains. The relationship
of Photorhabdus strains was defined by rep-PCR.
The upper axis of Fig. 5 measures the percentage similarity of
strains based on scoring of rep-PCR products (i.e., 0.0 [no
similarity] to 1.0 [100% similarity]). At the right axis, the
numbers and letters indicate the various strains tested: 14=W-14,
Hm=Hm, H9=H9, 7=WX-7, 1=WX-1, 2=WX-2, 88=HP88, NC-1=NC-1, 4=WX-4,
9=WX-9, 8=WX-8, 10=WX-10, WIR=WIR, 3=WX-3, 11=WX-11, 5=WX-5,
6=WX-6, 12=WX-12, x14=WX-14, 15=WX-15, Hb=Hb, B2=B2, 48 through
52=ATCC 43948 through ATCC 43952. Vertical lines separating
horizontal lines indicate the degree of relatedness (as read from
the extrapolated intersection of the vertical line with the upper
axis) between strains or groups of strains at the base of the
horizontal lines (e.g., strain W-14 is approximately 60% similar
to strains H9 and Hm).

Fig. 6 is an illustration of the genomic maps of the W-14
strain.

Detailed Description of the Invention 

The present inventions are directed to the discovery of a
unique class of insecticidal protein toxins from the genus
Photorhabdus that have oral toxicity against insects. A unique
feature of Photorhabdus is its bioluminescence. Photorhabdus may
be isolated from a variety of sources. One such source is
nematodes, more particularly nematodes of the genus
Heterorhabditis. Another such source is human clinical
samples from wounds; see Farmer et al. 1989 J. Clin. Microbiol.
27 pp. 1594-1600. These saprophytic strains are deposited in the
American Type Culture Collection (Rockville, MD) as ATCC #s 43948,
43949, 43950, 43951, and 43952, and are incorporated herein by
reference. It is possible that other sources could harbor
Photorhabdus bacteria that produce insecticidal toxins. Such
sources in the environment could be either terrestrial or aquatic
based.

The genus Photorhabdus is taxonomically defined as a member
of the Family Enterobacteriaceae, although it has certain traits
atypical of this family. For example, strains of this genus are
nitrate reduction negative, yellow and red pigment producing and
bioluminescent. This latter trait is otherwise unknown within
the Enterobacteriaceae. Photorhabdus has only recently been
described as a genus separate from the Xenorhabdus (Boemare
et al., 1993 Int. J. Syst. Bacteriol. 43, 249-255). This
differentiation is based on DNA-DNA hybridization studies,
phenotypic differences (e.g., presence (Photorhabdus) or absence
(Xenorhabdus) of catalase and bioluminescence) and the Family of
the nematode host (Xenorhabdus: Steinernematidae; Photorhabdus:
Heterorhabditidae). Comparative cellular fatty-acid analyses
(Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki
et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401) support the
separation of Photorhabdus from Xenorhabdus.

In order to establish that the strain collection disclosed
herein was comprised of Photorhabdus strains, the strains were
characterized based on recognized traits which define
Photorhabdus and differentiate it from other Enterobacteriaceae
and Xenorhabdus species (Farmer, 1984 Bergey's Manual of
Systematic Bacteriology Vol. 1 pp. 510-511; Akhurst and Boemare 1988,
J. Gen. Microbiol. 134 pp. 1835-1845; Boemare et al. 1993 Int. J.
Syst. Bacteriol. 43 pp. 249-255, which are incorporated herein by
reference). The traits studied were the following: gram stain
negative rods, organism size, colony pigmentation, inclusion
bodies, presence of catalase, ability to reduce nitrate,
bioluminescence, dye uptake, gelatin hydrolysis, growth on
selective media, growth temperature, survival under anaerobic
conditions and motility. Fatty acid analysis was used to confirm
that the strains herein all belong to the single genus
Photorhabdus.

Currently, the bacterial genus Photorhabdus is comprised of
a single defined species, Photorhabdus luminescens (ATCC Type
strain #29999, Poinar et al., 1977, Nematologica 23, 97-102). A
variety of related strains have been described in the literature
(e.g. Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845;
Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255; Putz
et al. 1990, Appl. Environ. Microbiol., 56, 181-186). Numerous
Photorhabdus strains have been characterized herein. Such
strains are listed in Table 18 in the Examples. Because there is
currently only one species (luminescens) defined within the genus
Photorhabdus, the luminescens species traits were used to
characterize the strains herein. As can be seen in Fig. 5, these
strains are quite diverse. It is foreseeable that in the
future there may be other Photorhabdus species that will have
some of the attributes of the luminescens species as well as some
different characteristics that are presently not defined as a
trait of Photorhabdus luminescens. However, the scope of the
invention herein extends to any Photorhabdus species or strains
which produce proteins that have functional activity as insect
control agents, regardless of other traits and characteristics.

Furthermore, as is demonstrated herein, the bacteria of the
genus Photorhabdus produce proteins that have functional activity
as defined herein. Of particular interest are proteins produced
by the species Photorhabdus luminescens. The inventions herein
should in no way be limited to the strains which are disclosed
herein. These strains illustrate for the first time that
proteins produced by diverse isolates of Photorhabdus are toxic
upon exposure to insects. Thus, included within the inventions
described herein are the strains specified herein and any mutants
thereof, as well as any strains or species of the genus
Photorhabdus that have the functional activity described herein.

Several terms used herein have a particular meaning, as follows:

By "functional activity" it is meant herein that the protein
toxins function as insect control agents in that the proteins are
orally active, or have a toxic effect, or are able to disrupt or
deter feeding, which may or may not cause death of the insect.
When an insect comes into contact with an effective amount of
toxin delivered via transgenic plant expression, formulated
protein composition(s), sprayable protein composition(s), a bait
matrix or other delivery system, the result is typically death
of the insect, or the insects do not feed upon the source which
makes the toxins available to the insects.

The protein toxins discussed herein are typically referred to as
"insecticides". By insecticides it is meant herein that the
protein toxins have a "functional activity" as further defined
herein and are used as insect control agents.

By the use of the term "oligonucleotides" it is meant a
macromolecule consisting of a short chain of nucleotides of
either RNA or DNA. Such a chain could be at least one nucleotide
long, but is typically in the range of about 10 to about 12
nucleotides. The determination of the length of the
oligonucleotide is well within the skill of an artisan and should
not be a limitation herein. Therefore, oligonucleotides may be
shorter than 10 or longer than 12 nucleotides.

By the use of the term "toxic" or "toxicity" as used herein it is
meant that the toxins produced by Photorhabdus have "functional
activity" as defined herein.

By the use of the term "genetic material" herein, it is meant to
include all genes, nucleic acid, DNA and RNA.

Fermentation broths from selected strains reported in
Table 18 were used to determine the breadth of insecticidal
toxin production by the Photorhabdus genus and the
insecticidal spectrum of these toxins, and to provide source
material from which to purify the toxin complexes. The strains
characterized herein have been shown to have oral toxicity
against a variety of insect orders. Such insect orders include
but are not limited to Coleoptera, Homoptera, Lepidoptera,
Diptera, Acarina, Hymenoptera and Dictyoptera.

As with other bacterial toxins, the rate of mutation of the
bacteria in a population causes many related toxins, slightly
different in sequence, to exist. Toxins of interest here are
those which produce protein complexes toxic to a variety of
insects upon exposure, as described herein. Preferably, the
toxins are active against Lepidoptera, Coleoptera, Homoptera,
Diptera, Hymenoptera, Dictyoptera and Acarina. The inventions
herein are intended to capture the protein toxins homologous to
protein toxins produced by the strains herein and any derivative
strains thereof, as well as any protein toxins produced by
Photorhabdus. These homologous proteins may differ in sequence,
but do not differ in function from those toxins described herein.
Homologous toxins are meant to include protein complexes of
between 300 kDa and 2,000 kDa that are comprised of at least two
(2) subunits, where a subunit is a peptide which may or may not
be the same as the other subunit. Various protein subunits have
been identified and are taught in the Examples herein.
Typically, the protein subunits are between about 18 kDa to about
230 kDa; between about 160 kDa to about 230 kDa; 100 kDa to 160
kDa; about 80 kDa to about 100 kDa; and about 50 kDa to about 80
kDa.

By the use of the term "Photorhabdus toxin" it is meant any
protein produced by a Photorhabdus microorganism strain
which has functional activity against insects, where the
Photorhabdus toxin could be formulated as a sprayable
composition, expressed by a transgenic plant, formulated as
a bait matrix, delivered via a Baculovirus, or delivered by
any other applicable host or delivery system.

As discussed above, some Photorhabdus strains can be
isolated from nematodes. Some nematodes, elongated cylindrical
parasitic worms of the phylum Nematoda, have evolved an ability
to exploit insect larvae as a favored growth environment. The
insect larvae provide a source of food for growing nematodes and
an environment in which to reproduce. One dramatic effect that
follows invasion of larvae by certain nematodes is larval death.
Larval death results from the presence of, in certain nematodes,
bacteria that produce an insecticidal toxin which arrests larval
growth and inhibits feeding activity.

Interestingly, it appears that each genus of insect
parasitic nematode hosts a particular species of bacterium,
uniquely adapted for symbiotic growth with that nematode. In the
interim since this research was initiated, the bacterial genus
Xenorhabdus was reclassified into the genera Xenorhabdus
and Photorhabdus. Bacteria of the genus Photorhabdus are
characterized as being symbionts of Heterorhabditis nematodes
while Xenorhabdus species are symbionts of the Steinernema
species. This change in nomenclature is reflected in this
specification, but in no way should a change in nomenclature
alter the scope of the inventions described herein.

The peptides and genes that are disclosed herein are named
according to the guidelines recently published in the Journal of
Bacteriology "Instructions to Authors" p. i-xii (Jan. 1996),
which is incorporated herein by reference. The following
peptides and genes were isolated from Photorhabdus strain W-14.
Peptide / Gene Nomenclature
Toxin complex (Tc)

Peptide Name    Gene Name    Patent Sequence ID#

tca genomic region
TcaA            tcaA         12
TcaAiii         tcaA         4
TcaBi           tcaB         3 (19, 20)
TcaBii          tcaB         5
TcaC            tcaC         2

tcb genomic region
TcbA            tcbA         16
TcbAi           tcbA         (pro-peptide)
TcbAii          tcbA         1 (21, 22, 23, 24)
TcbAiii         tcbA         40

tcc genomic region
TccA            tccA
TccB            tccB

tcd genomic region
TcdAi           tcdA         (pro-peptide)
TcdAii          tcdA         13 (38, 39, 17, 18)
TcdAiii         tcdA         41 (42, 43)
TcdB            tcdB         14

(Bracketed sequence numbers indicate internal amino acid sequences
obtained by tryptic digests.)



The sequences listed above are grouped by genomic region.
The tcbA gene was expressed in E. coli as two protein fragments,
TcbA and TcbAiii, as illustrated in the Examples. It may be
beneficial to have proteolytic clipping of some sequences to
obtain the higher activity of the toxins for commercial
transgenic applications.

The toxins described herein are quite unique in that the
toxins have functional activity, which is key to developing an
insect management strategy. In developing an insect management
strategy, it is possible to delay or circumvent the protein
degradation process by injecting a protein directly into an
organism, avoiding its digestive tract. In such cases, the
protein administered to the organism will retain its function
until it is denatured, non-specifically degraded, or eliminated
by the immune system in higher organisms. Injection into insects
of an insecticidal toxin has potential application only in the
laboratory, and then only on large insects which are easily
injected. The observation that the insecticidal protein toxins
herein described exhibit their toxic activity after oral
ingestion or contact with the toxins permits the development of
an insect management plan based solely on the ability to
incorporate the protein toxins into the insect diet. Such a plan
could result in the production of insect baits.

The Photorhabdus toxins may be administered to insects in a
purified form. The toxins may also be delivered in amounts from
about 1 to about 100 mg / liter of broth. This may vary with
formulation conditions, conditions of the inoculum source,
techniques for isolation of the toxin, and the like. The toxins
may be administered as an exudate secretion or cellular protein
originally expressed in a heterologous prokaryotic or eukaryotic
host. Bacteria are typically the hosts in which proteins are
expressed. Eukaryotic hosts could include but are not limited to
plants, insects and yeast. Alternatively, the toxins may be
produced in bacteria or transgenic plants in the field or in the
insect by a baculovirus vector. Typically the toxins will be
introduced to the insect by incorporating one or more of the
toxins into the insects' feed.

Complete lethality to feeding insects is useful but is not
required to achieve useful toxicity. If the insects avoid the
toxin or cease feeding, that avoidance will be useful in some
applications, even if the effects are sublethal. For example, if
insect resistant transgenic crop plants are desired, a reluctance
of insects to feed on the plants is as useful as lethal toxicity
to the insects since the ultimate objective is protection of the
plants rather than killing the insect.

There are many other ways in which toxins can be
incorporated into an insect's diet. As an example, it is
possible to adulterate the larval food source with the toxic
protein by spraying the food with a protein solution, as
disclosed herein. Alternatively, the protein could be
genetically engineered into an otherwise harmless bacterium,
which could then be grown in culture, and either applied to the
food source or allowed to reside in the soil in an area in which
insect eradication was desirable. Also, the protein could be
genetically engineered directly into an insect food source. For
[...]

[...] be transformed during dedifferentiation using appropriate
techniques within the skill of an artisan.

Another variable is the choice of a selectable marker. The
preference for a particular marker is at the discretion of the
artisan, but any of the following selectable markers may be used
along with any other gene not listed herein which could function
as a selectable marker. Such selectable markers include but are
not limited to the aminoglycoside phosphotransferase gene of
transposon Tn5 (Aph II), which encodes resistance to the
antibiotics kanamycin, neomycin and G418, as well as those genes
which code for resistance or tolerance to glyphosate; hygromycin;
methotrexate; phosphinothricin (bialaphos); imidazolinones,
sulfonylureas and triazolopyrimidine herbicides, such as
chlorsulfuron; bromoxynil, dalapon and the like.

In addition to a selectable marker, it may be desirable to
use a reporter gene. In some instances a reporter gene may be
used without a selectable marker. Reporter genes are genes which
are typically not present or expressed in the recipient organism
or tissue. The reporter gene typically encodes a protein
which provides for some phenotypic change or enzymatic property.
Examples of such genes are provided in K. Weising et al. Ann.
Rev. Genetics, 22, 421 (1988), which is incorporated herein by
reference. A preferred reporter gene is the glucuronidase (GUS)
gene.

Regardless of transformation technique, the gene is
preferably incorporated into a gene transfer vector adapted to
express the Photorhabdus toxins in the plant cell by including in
the vector a plant promoter. In addition to plant promoters,
promoters from a variety of sources can be used efficiently in
plant cells to express foreign genes. For example, promoters of
bacterial origin, such as the octopine synthase promoter, the
nopaline synthase promoter, the mannopine synthase promoter;
promoters of viral origin, such as the cauliflower mosaic virus
(35S and 19S) and the like may be used. Plant promoters include,
but are not limited to ribulose-1,5-bisphosphate (RUBP)
carboxylase small subunit (ssu), beta-conglycinin promoter,
phaseolin promoter, ADH promoter, heat-shock promoters and tissue
specific promoters. Promoters may also contain certain enhancer
sequence elements that may improve the transcription efficiency.
Typical enhancers include but are not limited to Adh-intron 1 and
Adh-intron 6. Constitutive promoters may be used. Constitutive
promoters direct continuous gene expression in all cell types
and at all times (e.g., actin, ubiquitin, CaMV 35S). Tissue
specific promoters are responsible for gene expression in
specific cell or tissue types, such as the leaves or seeds (e.g.,
zein, oleosin, napin, ACP) and these promoters may also be used.
Promoters may also be active during a certain stage of the
plant's development as well as active in plant tissues and
organs. Examples of such promoters include but are not limited
to pollen-specific, embryo specific, corn silk specific, cotton
fiber specific, root specific, seed endosperm specific promoters
and the like.

Under certain circumstances it may be desirable to use an
inducible promoter. An inducible promoter is responsible for
expression of genes in response to a specific signal, such as:
physical stimulus (heat shock genes); light (RUBP carboxylase);
hormone (Em); metabolites; and stress. Other desirable
transcription and translation elements that function in plants
may be used. Numerous plant-specific gene transfer vectors are
known to the art.

In addition, it is known that to obtain high expression of
bacterial genes in plants it is preferred to reengineer the
bacterial genes so that they are more efficiently expressed in
the cytoplasm of plants. Maize is one such plant where it is
preferred to reengineer the bacterial gene(s) prior to
transformation to increase the expression level of the toxin in
the plant. One reason for the reengineering is the very low G+C
content of the native bacterial gene(s) (and consequent skewing
towards high A+T content). This results in the generation of
sequences mimicking or duplicating plant gene control sequences
that are known to be highly A+T rich. The presence of some A+T-
rich sequences within the DNA of the gene(s) introduced into
plants (e.g., TATA box regions normally found in gene promoters)
may result in aberrant transcription of the gene(s). On the
other hand, the presence of other regulatory sequences residing
in the transcribed mRNA (e.g., polyadenylation signal sequences
(AAUAAA), or sequences complementary to small nuclear RNAs
involved in pre-mRNA splicing) may lead to RNA instability.
Therefore, one goal in the design of reengineered bacterial
gene(s), more preferably referred to as plant optimized gene(s),
is to generate a DNA sequence having a higher G+C content, and
preferably one close to that of plant genes coding for metabolic
enzymes. Another goal in the design of the plant optimized
gene(s) is to generate a DNA sequence that not only has a higher
G+C content but is also modified in such a way that the sequence
changes do not hinder translation.

An example of a plant that has a high G+C content is maize. 
The table below illustrates how high the G+C content is in maize. 
As in maize, it is thought that G+C content in other plants is 
also high. 

Table 1

Compilation of G+C contents of protein coding regions
of maize genes

Protein Class (a)              Range %G+C    Mean %G+C (b)

Metabolic Enzymes (40)         44.4-75.3     59.0 (8.0)
Storage Proteins
  Group I (23)                 46.0-51.9     48.1 (1.3)
  Group II (13)                60.4-74.3     67.5 (3.2)
  Group I + II (36)            46.0-74.3     55.1 (9.6) (c)
Structural Proteins (18)       48.6-70.5     63.6 (6.7)
Regulatory Proteins (5)        57.2-68.9     62.0 (4.9)
Uncharacterized Proteins (9)   41.5-70.3     64.3 (7.2)
All Proteins (108)             44.4-75.3     60.8 (5.2)

(a) Number of genes in class given in parentheses.
(b) Standard deviations given in parentheses.
(c) Combined groups mean ignored in calculation of
    overall mean.



For the data in Table 1, coding regions of the genes were
extracted from GenBank (Release 71) entries, and base
compositions were calculated using the MacVector™ program (IBI,
New Haven, CT). Intron sequences were ignored in the
calculations. Group I and II storage protein gene sequences were
distinguished by their marked difference in base composition.
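
The base composition calculation described above can be
illustrated with a minimal sketch. It is not the MacVector
program; the function name gc_percent and the two short example
sequences are hypothetical and are not taken from any gene
disclosed herein.

```python
def gc_percent(coding_sequence: str) -> float:
    """Return the percentage of G and C bases in a DNA coding sequence."""
    seq = coding_sequence.upper()
    # Keep only unambiguous bases so that gaps or N's do not skew the result.
    bases = [b for b in seq if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical 12-base fragments used only to exercise the function.
at_rich_fragment = "ATGAAATTTTAA"
gc_rich_fragment = "ATGGCCGGCTGA"

print(round(gc_percent(at_rich_fragment), 1))
print(round(gc_percent(gc_rich_fragment), 1))
```

Applied to the coding region only (introns removed), the same
count gives a value comparable to the %G+C columns of Table 1.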

Due to the plasticity afforded by the redundancy of the
genetic code (i.e., some amino acids are specified by more than
one codon), evolution of the genomes of different organisms or
classes of organisms has resulted in differential usage of
redundant codons. This "codon bias" is reflected in the mean base
composition of protein coding regions. For example, organisms
with relatively low G+C contents utilize codons having A or T in
the third position of redundant codons, whereas those having
higher G+C contents utilize codons having G or C in the third
position. It is thought that the presence of "minor" codons
within a gene's mRNA may reduce the absolute translation rate of
that mRNA, especially when the relative abundance of the charged
tRNA corresponding to the minor codon is low. An extension of
this is that the diminution of translation rate by individual
minor codons would be at least additive for multiple minor
codons. Therefore, mRNAs having high relative contents of minor
codons would have correspondingly low translation rates. This
rate would be reflected by the synthesis of low levels of the
encoded protein.
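
As an illustration of how codon bias can be observed in a coding
region, the following sketch tallies codon usage and the fraction
of codons ending in G or C. The function names and the example
sequence are hypothetical, and the sketch assumes an in-frame
sequence whose length is a multiple of three.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count each codon in an in-frame coding sequence."""
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


def third_position_gc(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    gc3 = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc3 / total


# Hypothetical five-codon fragment: ATG GCT GCT GCC TAA
example = "ATGGCTGCTGCCTAA"
print(codon_counts(example).most_common(2))
print(third_position_gc(example))
```

Comparing such tallies for a bacterial gene against tallies for
genes of the target plant shows which codons of the bacterial
gene would be "minor" codons in the plant.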

In order to reengineer the bacterial genets) , the codon bias 
of the plant is determined. The codon bias is the statistical 
codon distribution that the plant uses for coding its proteins. 

After determining the bias, the percent frequency of the codons
in the gene(s) of interest is determined. The primary codons
preferred by the plant should be determined, as well as the second
and third choices of preferred codons. The amino acid sequence of
the protein of interest is reverse translated so that the
resulting nucleic acid sequence codes for the same protein as the
native bacterial gene, but corresponds to the first preferred
codons of the desired plant. The new sequence is analyzed for
restriction enzyme sites that might have been created by the
modification. The identified sites are further modified by
replacing the codons with second or third choice preferred codons.
Other sites in the sequence which could affect the transcription
or translation of the gene of interest are the exon:intron 5' or
3' junctions, poly A addition signals, or RNA polymerase
termination signals. The sequence is further analyzed and
modified to reduce the frequency of TA or GC doublets. In
addition to the doublets, G or C sequence blocks that have more
than about four residues that are the same can affect
transcription of the sequence. Therefore, these blocks are also
modified by replacing the codons of first or second choice, etc.
with the next preferred codon of choice. It is preferred that the
plant optimized gene(s) contain about 63% of first choice codons,
between about 22% and about 37% second choice codons, and between
15% and 0% third choice codons, wherein the total percentage is
100%. Most preferably, the plant optimized gene(s) contain about
63% of first choice codons, at least about 22% second choice
codons, about 7.5% third choice codons, and about 7.5% fourth
choice codons, wherein the total percentage is 100%. The method
described above enables one skilled in the art to modify gene(s)
that are foreign to a particular plant so that the genes are
optimally expressed in plants. The method is further illustrated
in pending provisional application U.S. 60/005,405 filed on
October 13, 1995, which is incorporated herein by reference.
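
The reverse translation and site removal steps described above can
be sketched in a few lines of Python. This is only an illustrative
sketch and not the method of the referenced provisional application:
the ranked codon table RANKED, the four-residue peptide, and the
helper names are hypothetical placeholders, and a real design would
use a codon bias table compiled for the plant being transformed. The
site GAATTC (the EcoRI recognition sequence) is used as the example
of a restriction site created by the first pass.

```python
# Hypothetical ranked codon choices per amino acid (first choice first).
# Placeholder values for illustration; not a real plant codon bias table.
RANKED = {
    "M": ["ATG"],
    "E": ["GAA", "GAG"],
    "F": ["TTC", "TTT"],
    "K": ["AAG", "AAA"],
}


def reverse_translate(protein, ranked):
    """Reverse translate using only the first-choice codon of each residue."""
    return [ranked[aa][0] for aa in protein]


def remove_site(protein, codons, site, ranked):
    """Swap in lower-ranked synonymous codons until `site` no longer occurs."""
    codons = list(codons)
    seq = "".join(codons)
    while site in seq:
        pos = seq.find(site)
        # Indices of the codons that overlap this occurrence of the site.
        first, last = pos // 3, (pos + len(site) - 1) // 3
        for i in range(first, last + 1):
            trials = (codons[:i] + [alt] + codons[i + 1:]
                      for alt in ranked[protein[i]][1:])
            better = next((t for t in trials
                           if "".join(t).count(site) < seq.count(site)), None)
            if better is not None:
                codons, seq = better, "".join(better)
                break
        else:
            raise ValueError("no synonymous codon removes the site")
    return codons


def choice_fractions(protein, codons, ranked):
    """Fraction of codons that are first choice (0), second choice (1), etc."""
    ranks = [ranked[aa].index(c) for aa, c in zip(protein, codons)]
    return {r: ranks.count(r) / len(ranks) for r in sorted(set(ranks))}


protein = "MEFK"  # hypothetical peptide, not a toxin sequence
first_pass = reverse_translate(protein, RANKED)
cleaned = remove_site(protein, first_pass, "GAATTC", RANKED)
print("".join(first_pass), "->", "".join(cleaned))
print(choice_fractions(protein, cleaned, RANKED))
```

The same loop could be repeated for other unwanted motifs (for
example poly A addition signals or long G or C blocks), and the
output of choice_fractions compared against the preferred
percentages of first, second and third choice codons.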

Thus, in order to design plant optimized gene(s), the amino
acid sequence of the toxins is reverse translated into a DNA
sequence, utilizing a nonredundant genetic code established from
a codon bias table compiled for the gene DNA sequence for the
particular plant being transformed. The resulting DNA sequence,
which is completely homogeneous in codon usage, is further
modified to establish a DNA sequence that, besides having a
higher degree of codon diversity, also contains strategically
placed restriction enzyme recognition sites, desirable base
composition, and a lack of sequences that might interfere with
transcription of the gene, or translation of the product mRNA.

It is theorized that bacterial genes may be more easily
expressed in plants if the bacterial genes are expressed in the
plastids. Thus, it may be possible to express bacterial genes in
plants, without optimizing the genes for plant expression, and
obtain high expression of the protein. See U.S. Patent Nos.
4,762,785; 5,451,513 and 5,545,817, which are incorporated herein
by reference.

One of the issues regarding commercial exploitation of transgenic
plants is resistance management. This is of particular concern
with Bacillus thuringiensis toxins. There are numerous companies
commercially exploiting Bacillus thuringiensis and there has been
much concern about insects becoming resistant to Bt toxins. One
strategy for insect resistance management would be to combine the
toxins produced by Photorhabdus with toxins such as Bt, vegetative
insect proteins (Ciba Geigy) or other toxins. The combinations
could be formulated for a sprayable application or could be
molecular combinations. Plants could be transformed with
Photorhabdus genes that produce insect toxins together with other
insect toxin genes such as Bt.

European Patent Application 0400246A1 describes
transformation of a plant with 2 Bt genes, which could be any 2
genes. Another way to produce a transgenic plant that contains
more than one insect resistance gene would be to produce two
plants, with each plant containing an insect resistance gene.
These plants would be backcrossed using traditional plant breeding
techniques to produce a plant containing more than one insect
resistance gene.

In addition to producing a transformed plant containing
plant optimized gene(s), there are other delivery systems where
it may be desirable to reengineer the bacterial gene(s). Along
the same lines, a genetically engineered, easily isolated protein
toxin fusing together both a molecule attractive to insects as a
food source and the insecticidal activity of the toxin may be
engineered and expressed in bacteria or in eukaryotic cells using
standard, well-known techniques. After purification in the
laboratory, such a toxic agent with "built-in" bait could be
packaged inside standard insect trap housings.

Another delivery scheme is the incorporation of the genetic
material of toxins into a baculovirus vector. Baculoviruses
infect particular insect hosts, including those desirably
targeted with the Photorhabdus toxins. Infectious baculovirus
harboring an expression construct for the Photorhabdus toxins
could be introduced into areas of insect infestation to thereby
intoxicate or poison infected insects.

Transfer of the insecticidal properties requires nucleic
acid sequences encoding the amino acid sequences for
the Photorhabdus toxins integrated into a protein expression
vector appropriate to the host in which the vector will reside.
One way to obtain a nucleic acid sequence encoding a protein with
insecticidal properties is to isolate the native genetic material
which produces the toxins from Photorhabdus, using information
deduced from the toxin's amino acid sequence, large portions of
which are set forth below. As described below, methods of
purifying the proteins responsible for toxin activity are also
disclosed.

Using N-terminal amino acid sequence data, such as set forth
below, one can construct oligonucleotides complementary to all,
or a section of, the DNA bases that encode the first amino acids
of the toxin. These oligonucleotides can be radiolabeled and
used as molecular probes to isolate the genetic material from a
genomic genetic library built from genetic material isolated from
strains of Photorhabdus. The genetic library can be cloned in
plasmid, cosmid, phage or phagemid vectors. The library could be
transformed into Escherichia coli and screened for toxin
production by the transformed cells using antibodies raised
against the toxin or direct assays for insect toxicity.

This approach requires the production of a battery of
oligonucleotides, since the degenerate genetic code allows an
amino acid to be encoded in the DNA by any of several
three-nucleotide combinations. For example, the amino acid arginine
can be encoded by nucleic acid triplets CGA, CGC, CGG, CGT, AGA,
and AGG. Since one cannot predict which triplet is used at those
positions in the toxin gene, one must prepare oligonucleotides
with each potential triplet represented. More than one DNA
molecule corresponding to a protein subunit may be necessary to
construct a sufficient number of oligonucleotide probes to
recover all of the protein subunits necessary to achieve oral
toxicity.
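
To make the size of such a battery concrete, the following Python
sketch enumerates every coding-strand sequence that could encode a
short peptide. It is an illustration only: the peptide "MRK" and the
function names are hypothetical and are not taken from the toxin
sequence data, and only a handful of amino acids from the standard
genetic code are included. A probe complementary to each sequence
would still have to be derived from it.

```python
from itertools import product

# Subset of the standard genetic code (DNA codons), keyed by amino acid.
CODONS = {
    "R": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],  # arginine: six triplets
    "M": ["ATG"],
    "W": ["TGG"],
    "K": ["AAA", "AAG"],
    "D": ["GAC", "GAT"],
}


def degenerate_oligos(peptide):
    """Return every DNA sequence that could encode the peptide."""
    return ["".join(combo)
            for combo in product(*(CODONS[aa] for aa in peptide))]


def degeneracy(peptide):
    """Number of distinct oligonucleotides needed to cover the peptide."""
    count = 1
    for aa in peptide:
        count *= len(CODONS[aa])
    return count


peptide = "MRK"  # hypothetical N-terminal fragment
oligos = degenerate_oligos(peptide)
print(degeneracy(peptide), "candidate sequences, e.g.", oligos[0])
```

Because the count is the product of the per-residue codon numbers,
stretches rich in methionine or tryptophan (one codon each) give the
smallest probe sets, while each arginine multiplies the set by six.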

From the amino acid sequence of the purified protein,
genetic materials responsible for the production of toxins can
readily be isolated and cloned, in whole or in part, into an
expression vector using any of several techniques well-known to
one skilled in the art of molecular biology. A typical
expression vector is a DNA plasmid, though other transfer means
including, but not limited to, cosmids, phagemids and phage are
also envisioned. In addition to features required or desired for
plasmid replication, such as an origin of replication and
antibiotic resistance or other form of a selectable marker such
as the bar gene of Streptomyces hygroscopicus or
viridochromogenes, protein expression vectors normally
additionally require an expression cassette which incorporates
the cis-acting sequences necessary for transcription and
translation of the gene of interest. The cis-acting sequences
required for expression in prokaryotes differ from those required
in eukaryotes and plants.

A eukaryotic expression cassette requires a transcriptional
promoter upstream (5') to the gene of interest, a transcriptional
termination region such as a poly-A addition site, and a ribosome
binding site upstream of the gene of interest's first codon. In
bacterial cells, a useful transcriptional promoter that could be
included in the vector is the T7 RNA Polymerase-binding promoter.
Promoters, as previously described herein, are known to
efficiently promote transcription of mRNA. Also upstream from
the gene of interest the vector may include a nucleotide sequence
encoding a signal sequence known to direct a covalently linked
protein to a particular compartment of the host cells such as the
cell surface.

Insect viruses, or baculoviruses, are known to infect and
adversely affect certain insects. The effect of the viruses on
insects is slow, and viruses do not stop the feeding of insects.
Thus viruses are not viewed as being useful as insect pest
control agents. Combining the Photorhabdus toxin genes into a
baculovirus vector could provide an efficient way of transmitting
the toxins while increasing the lethality of the virus. In
addition, since different baculoviruses are specific to different
insects, it may be possible to use a particular toxin to
selectively target particularly damaging insect pests. A
particularly useful vector for the toxin genes is the nuclear
polyhedrosis virus. Transfer vectors using this virus have been
described and are now the vectors of choice for transferring
foreign genes into insects. The virus-toxin gene recombinant may
be constructed in an orally transmissible form. Baculoviruses
normally infect insect victims through the mid-gut intestinal
mucosa. The toxin gene inserted behind a strong viral coat
protein promoter would be expressed and should rapidly kill the
infected insect.

In addition to an insect virus or baculovirus or transgenic
plant delivery system for the protein toxins of the present
invention, the proteins may be encapsulated using Bacillus
thuringiensis encapsulation technology such as but not limited to
U.S. Patent Nos. 4,695,455; 4,695,462; 4,861,595, which are all
incorporated herein by reference. Another delivery system for
the protein toxins of the present invention is formulation of the
protein into a bait matrix, which could then be used in above and
below ground insect bait stations. Examples of such technology
include but are not limited to PCT Patent Application WO
93/23998, which is incorporated herein by reference.

As is described above, it might become necessary to modify
the sequence encoding the protein when expressing it in a
non-native host, since the codon preferences of other hosts may
differ from those of Photorhabdus. In such a case, translation
may be quite inefficient in a new host unless compensating
modifications to the coding sequence are made. Additionally,
modifications to the amino acid sequence might be desirable to
avoid inhibitory cross-reactivity with proteins of the new host,
or to refine the insecticidal properties of the protein in the
new host. A genetically modified toxin gene might encode a toxin
exhibiting, for example, enhanced or reduced toxicity, altered
insect resistance development, altered stability, or modified
target species specificity.

In addition to the Photorhabdus genes encoding the toxins,
the scope of the present invention is intended to include related
nucleic acid sequences which encode amino acid biopolymers
homologous to the toxin proteins and which retain the toxic
effect of the Photorhabdus proteins in insect species after oral
ingestion.

For instance, the toxins used in the present invention seem
to first inhibit larval feeding before death ensues. By
manipulating the nucleic acid sequence of Photorhabdus toxins or
its controlling sequences, genetic engineers placing the toxin
gene into plants could modulate its potency or its mode of action
to, for example, keep the eating-inhibitory activity while
eliminating the absolute toxicity to the larvae. This change
could permit the transformed plant to survive until harvest
without having the unnecessarily dramatic effect on the ecosystem
of wiping out all target insects. All such modifications of the
gene encoding the toxin, or of the protein encoded by the gene,
are envisioned to fall within the scope of the present invention.

Other envisioned modifications of the nucleic acid include
the addition of targeting sequences to direct the toxin to
particular parts of the insect larvae for improving its
efficiency.

Strains ATCC 55397, 43948, 43949, 43950, 43951, 43952 have
been deposited in the American Type Culture Collection, 12301
Parklawn Drive, Rockville, MD 20852 USA. Amino acid and
nucleotide sequence data for the W-14 native toxin (ATCC 55397)
are presented below. Isolation of the genomic DNA for the toxins
from the bacterial hosts is also exemplified herein.

Standard and molecular biology techniques were followed and
taught in the specification herein. Additional information may
be found in Sambrook, J., Fritsch, E. F., and Maniatis, T.
(1989), Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Press, which is incorporated herein by reference.

The following abbreviations are used throughout the Examples:
Tris = tris(hydroxymethyl)aminomethane; SDS = sodium dodecyl
sulfate; EDTA = ethylenediaminetetraacetic acid; IPTG =
isopropylthio-β-galactoside; X-gal = 5-bromo-4-chloro-3-indolyl-β-
D-galactoside; CTAB = cetyltrimethylammonium bromide; kbp =
kilobase pairs; dATP, dCTP, dGTP, dTTP, I = 2'-deoxynucleoside
5'-triphosphates of adenine, cytosine, guanine, thymine, and
inosine, respectively; ATP = adenosine 5'-triphosphate.

Example 1 

Purification of toxin from P. luminescens and Demonstration of 
toxicity after oral delivery of purified toxin 

The insecticidal protein toxin of the present invention was
purified from P. luminescens strain W-14, ATCC Accession Number
55397. Stock cultures of P. luminescens were maintained on petri
dishes containing 2% Proteose Peptone No. 3 (i.e., PP3, Difco
Laboratories, Detroit MI) in 1.5% agar, incubated at 25°C and
transferred weekly. Colonies of the primary form of the bacteria
were inoculated into 200 ml of PP3 broth supplemented with 0.5%
polyoxyethylene sorbitan mono-stearate (Tween 60, Sigma Chemical
Company, St. Louis MO) in a one liter flask. The broth cultures
were grown for 72 hours at 30°C on a rotary shaker. The toxin
proteins can be recovered from cultures grown in the presence or
absence of Tween; however, the absence of Tween can affect the
form of the bacteria grown and the profile of proteins produced
by the bacteria. In the absence of Tween, a variant shift occurs
insofar as the molecular weight of at least one identified toxin
subunit shifts from about 200 kDa to about 185 kDa.

The 72 hour cultures were centrifuged at 10,000 x g for 30
minutes to remove cells and debris. The supernatant fraction
that contained the insecticidal activity was decanted and brought
to 50 mM K2HPO4 by adding an appropriate volume of 1.0 M K2HPO4.
The pH was adjusted to 8.6 by adding potassium hydroxide. This
supernatant fraction was then mixed with DEAE-Sephacel (Pharmacia
LKB Biotechnology) which had been equilibrated with 50 mM K2HPO4.
The toxic activity was adsorbed to the DEAE resin. This mixture
was then poured into a 2.6 x 40 cm column and washed with 50 mM
K2HPO4 at room temperature at a flow rate of 30 ml/hr until the
effluent reached a steady baseline UV absorbance at 280 nm. The
column was then washed with 150 mM KCl until the effluent again
reached a steady 280 nm baseline. Finally the column was washed
with 300 mM KCl and fractions were collected.
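
The "appropriate volume" of 1.0 M phosphate stock in the step above
follows from a simple mass balance. The short Python sketch below
shows one way to compute it; it assumes that the supernatant itself
contributes a negligible amount of phosphate and that volumes are
additive, and the 200 ml sample volume is only an illustrative
figure, since the actual supernatant volume is not stated.

```python
def stock_volume_ml(sample_ml, stock_mM, target_mM):
    """Volume of stock to add so the final mixture reaches target_mM.

    Mass balance: stock_mM * v = target_mM * (sample_ml + v),
    assuming the sample starts with none of the solute.
    """
    if target_mM >= stock_mM:
        raise ValueError("target concentration must be below the stock")
    return target_mM * sample_ml / (stock_mM - target_mM)


# Illustrative sample volume (assumption); 1.0 M stock, 50 mM target.
volume = stock_volume_ml(sample_ml=200.0, stock_mM=1000.0, target_mM=50.0)
print(f"add {volume:.2f} ml of 1.0 M stock to 200 ml of supernatant")
```

Dividing by (stock - target) rather than by the stock concentration
accounts for the dilution caused by the added stock itself.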

Fractions containing the toxin were pooled and filter
sterilized using a 0.2 micron pore membrane filter. The toxin
was then concentrated and equilibrated to 100 mM KPO4, pH 6.9,
using an ultrafiltration membrane with a molecular weight cutoff
of 100 kDa at 4°C (Centriprep 100, Amicon Division, W. R. Grace and
Company). A 3 ml sample of the toxin concentrate was applied to
the top of a 2.6 x 95 cm Sephacryl S-400 HR gel filtration column
(Pharmacia LKB Biotechnology). The eluent buffer was 100 mM KPO4,
pH 6.9, which was run at a flow rate of 17 ml/hr, at 4°C. The
effluent was monitored at 280 nm.

Fractions were collected and tested for toxic activity.
Toxicity of chromatographic fractions was examined in a
biological assay using Manduca sexta larvae. Fractions were
either applied directly onto the insect diet (Gypsy moth wheat
germ diet, ICN Biochemicals Division - ICN Biomedicals, Inc.) or
administered by intrahemocelic injection of a 5 µl sample through
the first proleg of 4th or 5th instar larvae using a 30 gauge
needle. The weight of each larva within a treatment group was
recorded at 24 hour intervals. Toxicity was presumed if the
insect ceased feeding and died within several days of consuming
treated insect diet or if death occurred within 24 hours after
injection of a fraction.

The toxic fractions were pooled and concentrated using the
Centriprep-100 and were then analyzed by HPLC using a 7.5 mm x 60
cm TSK-GEL G-4000 SW gel permeation column with 100 mM potassium
phosphate, pH 6.9 eluent buffer running at 0.4 ml/min. This
analysis revealed the toxin protein to be contained within a
single sharp peak that eluted from the column with a retention
time of approximately 33.6 minutes. This retention time
corresponded to an estimated molecular weight of 1,000 kDa. Peak
fractions were collected for further purification while fractions
not containing this protein were discarded. The peak eluted from
the HPLC absorbed UV light at 218 and 280 nm but did not absorb at
405 nm. Absorbance at 405 nm was shown to be an attribute of
xenorhabdin antibiotic compounds.

Electrophoresis of the pooled peak fractions in a
non-denaturing agarose gel (Metaphor Agarose, FMC BioProducts) showed
that two protein complexes were present in the peak. The peak
material, buffered in 50 mM Tris-HCl, pH 7.0, was separated on a
1.5% agarose stacking gel buffered with 100 mM Tris-HCl at pH 7.0
and 1.9% agarose resolving gel buffered with 200 mM Tris-borate
at pH 8.3 under standard buffer conditions (anode buffer 1 M
Tris-HCl, pH 8.3; cathode buffer 0.025 M Tris, 0.192 M glycine). The
gels were run at 13 mA constant current at 15°C until the phenol
red tracking dye reached the end of the gel. Two protein bands
were visualized in the agarose gels using Coomassie brilliant
blue staining.

The slower migrating band was referred to as "protein band
1" and the faster migrating band was referred to as "protein band 2."
The two protein bands were present in approximately equal
amounts. The Coomassie stained agarose gels were used as a guide
to precisely excise the two protein bands from unstained portions
of the gels. The excised pieces containing the protein bands
were macerated and a small amount of sterile water was added. As
a control, a portion of the gel that contained no protein was
also excised and treated in the same manner as the gel pieces
containing the protein. Protein was recovered from the gel
pieces by electroelution into 100 mM Tris-borate pH 8.3, at 100
volts (constant voltage) for two hours. Alternatively, protein
was passively eluted from the gel pieces by adding an equal
volume of 50 mM Tris-HCl, pH 7.0, to the gel pieces, then
incubating at 30°C for 16 hours. This allowed the protein to
diffuse from the gel into the buffer, which was then collected.

Results of insect toxicity tests using HPLC-purified toxin
(33.6 min. peak) and agarose gel purified toxin demonstrated
toxicity of the extracts. Injection of 1.5 µg of the HPLC
purified protein killed within 24 hours. Both protein bands 1 and
2, recovered from agarose gels by passive elution or
electroelution, were lethal upon injection. The protein
concentration estimated for these samples was less than 50
ng/larva. A comparison of the weight gain and the mortality
between the groups of larvae injected with protein bands 1 or 2
indicated that protein band 1 was more toxic by injection
delivery.

When HPLC-purified toxin was applied to larval diet at a
concentration of 7.5 µg/larva, it caused a halt in larval weight
gain (24 larvae tested). The larvae began to feed, but after
consuming only a very small portion of the toxin treated diet,
they began to show pathological symptoms induced by the toxin and
the larvae ceased feeding. The insect frass became discolored and
most larvae showed signs of diarrhea. Significant insect
mortality resulted when several 5 µg toxin doses were applied to
the diet over a 7-10 day period.

Agarose-separated protein band 1 significantly inhibited
larval weight gain at a dose of 200 ng/larva. Larvae fed similar
concentrations of protein band 2 were not inhibited and gained
weight at the same rate as the control larvae. Twelve larvae
were fed eluted protein and 45 larvae were fed protein-containing
agarose pieces. These two sets of data indicate that protein
band 1 was orally toxic to Manduca sexta. In this experiment it
appeared that protein band 2 was not toxic to Manduca sexta.

Further analysis of protein bands 1 and 2 by SDS-PAGE under
denaturing conditions showed that each band was composed of
several smaller protein subunits. Proteins were visualized by
Coomassie brilliant blue staining followed by silver staining to
achieve maximum sensitivity.

The protein subunits in the two bands were very similar.
Protein band 1 contained 8 protein subunits of 25.1, 56.2, 60.8,
65.6, 166, 171, 184 and 208 kDa. Protein band 2 had an identical
profile except that the 25.1, 60.8, and 65.6 kDa proteins were
not present. The 56.2, 60.8, 65.6, and 184 kDa proteins were
present in the complex of protein band 1 at approximately equal
concentrations and represented 80% or more of the total protein
content of that complex.

The native HPLC-purified toxin was further characterized as
follows. The toxin was heat labile in that after being heated to
60°C for 15 minutes it lost its ability to kill or to inhibit
weight gain when injected or fed to M. sexta larvae. Assays were
designed to detect lipase, type C phospholipase, nuclease or red
blood cell hemolysis activities and were performed with purified
toxin. None of these activities were present. Antibiotic zone
inhibition assays were also done and the purified toxin failed to
inhibit growth of Gram-negative or -positive bacteria, yeast or
filamentous fungi, indicating that the toxin is not a xenorhabdin
antibiotic.

The native HPLC-purified toxin was tested for ability to
kill insects other than Manduca sexta. Table 2 lists insects
killed by the HPLC-purified P. luminescens toxin in this study.



Table 2

Insects Killed by P. luminescens Toxin

Common Name        Order         Genus and species      Route of Delivery

Tobacco hornworm   Lepidoptera   Manduca sexta          Oral and injected
Mealworm           Coleoptera    Tenebrio molitor       Oral
Pharaoh ant        Hymenoptera   Monomorium pharaonis   Oral
German cockroach   Dictyoptera   Blattella germanica    Oral and injected
Mosquito           Diptera       Aedes aegypti          Oral

The Photorhabdus luminescens utility and toxicity were
further characterized. Photorhabdus luminescens (strain W-14)
culture broth was produced as follows. The production medium was
2% Bacto Proteose Peptone Number 3 (PP3, Difco Laboratories,
Detroit, Michigan) in Milli-Q deionized water. Seed culture
flasks consisted of 175 ml medium placed in a 500 ml tribaffled
flask with a Delong neck, covered with a Kaput and autoclaved
for 20 minutes, T=250°F. Production flasks consisted of 500 ml
in a 2.8 liter tribaffled flask with a Delong neck,
covered by a Shin-etsu silicon foam closure. These were
autoclaved for 45 minutes, T=250°F. The seed culture was
incubated at 28°C at 150 rpm in a gyrotory shaking incubator with
a 2 inch throw. After 16 hours of growth, 1% of the seed culture
was placed in the production flask, which was allowed to grow for
24 hours before harvest. Production of the toxin appears to occur
during log phase growth. The microbial broth was transferred to
a 1L centrifuge bottle and the cellular biomass was pelleted (30
minutes at 2500 RPM at 4°C, [R.C.F. = ~1600] HG-4L Rotor, RC3
Sorvall centrifuge, Dupont, Wilmington, Delaware). The primary
broth was chilled at 4°C for 8-16 hours and recentrifuged at
least 2 hours (conditions above) to further clarify the broth by
removal of a putative mucopolysaccharide which precipitated upon
standing. (An alternative processing method combined both steps
and involved the use of a 16 hour clarification centrifugation,
same conditions as above.) This broth was then stored at 4°C
prior to bioassay or filtration.
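
The rotor speed and relative centrifugal force quoted above are
linked by the standard relation RCF = 1.118 x 10^-5 x r x RPM^2, with
r the rotor radius in centimeters. The Python sketch below, offered
only as a convenience for reproducing the spin in a different rotor,
back-calculates the radius implied by the stated figures. It assumes
that the garbled R.C.F. value in the text reads as approximately
1600 x g; the radius itself is not given in the text.

```python
RCF_CONSTANT = 1.118e-5  # converts radius (cm) * rpm^2 to multiples of g


def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def implied_radius_cm(rcf_value, rpm):
    """Rotor radius (cm) that gives rcf_value at the stated speed."""
    return rcf_value / (RCF_CONSTANT * rpm ** 2)


def rpm_for_rcf(rcf_value, radius_cm):
    """Speed (rpm) needed to reach rcf_value in a rotor of radius_cm."""
    return (rcf_value / (RCF_CONSTANT * radius_cm)) ** 0.5


# Figures as read from the text: 2500 RPM and an R.C.F. of about 1600.
radius = implied_radius_cm(1600.0, 2500.0)
print(f"implied rotor radius: {radius:.1f} cm")
```

With a different rotor, rpm_for_rcf can be used with that rotor's
radius to find the speed giving the same force.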

Photorhabdus culture broth and protein toxin(s) purified
from this broth showed activity (mortality and/or growth
inhibition, reduced adult emergence) against a number of insects.
More specifically, the activity is seen against corn rootworm
(larvae and adult), Colorado potato beetle, and turf grubs, which
are members of the insect order Coleoptera. Other members of the
Coleoptera include wireworms, pollen beetles, flea beetles, seed
beetles and weevils. Activity has also been observed against
aster leafhopper, which is a member of the order Homoptera.
Other members of the Homoptera include planthoppers, pear psylla,
apple sucker, scale insects, whiteflies, and spittle bugs, as
well as numerous host specific aphid species. The broth and
purified fractions are also active against beet armyworm, cabbage
looper, black cutworm, tobacco budworm, European corn borer, corn
earworm, and codling moth, which are members of the order
Lepidoptera. Other typical members of this order are clothes
moth, Indian mealmoth, leaf rollers, cabbage worm, cotton
bollworm, bagworm, Eastern tent caterpillar, sod webworm, and
fall armyworm. Activity is also seen against fruitfly and
mosquito larvae, which are members of the order Diptera. Other
members of the order Diptera are pea midge, carrot fly, cabbage
root fly, turnip root fly, onion fly, crane fly, house fly, and
various mosquito species. Activity is seen against carpenter ant
and Argentine ant, which are members of the order that also
includes fire ants, odorous house ants, and little black ants.

The broth/fraction is useful for reducing populations of
insects and was used in a method of inhibiting an insect
population. The method may comprise applying to a locus of the
insect an effective insect inactivating amount of the active
described. Results are reported in Table 3.

Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth (filter sterilized, cell-free) or
purified HPLC fractions were applied directly to the surface
(~1.5 cm²) of 0.25 ml of artificial diet in 30 µl aliquots
following dilution in control medium or 10 mM sodium phosphate
buffer, pH 7.0, respectively. The diet plates were allowed to
air-dry in a sterile flow-hood and the wells were infested with
single, neonate Diabrotica undecimpunctata howardi (Southern corn
rootworm, SCR) hatched from sterilized eggs, with second instar
SCR grown on artificial diet or with second instar Diabrotica
virgifera virgifera (Western corn rootworm, WCR) reared on corn
seedlings grown in Metromix. Second instar larvae were weighed
prior to addition to the diet. The plates were sealed, placed in
a humidified growth chamber and maintained at 27°C for the
appropriate period (4 days for neonate and adult SCR, 2-5 days
for WCR larvae, 7-14 days for second instar SCR). Mortality and
weight determinations were scored as indicated. Generally, 16
insects per treatment were used in all studies. Control
mortalities were as follows: neonate larvae, <5%, adult beetles,
5%.

Activity against Colorado potato beetle was tested as
follows. Photorhabdus culture broth or control medium was applied
to the surface (~2.0 cm²) of 1.5 ml of standard artificial diet
held in the wells of a 24-well tissue culture plate. Each well
received 50 µl of treatment and was allowed to air dry.
Individual second instar Colorado potato beetle (Leptinotarsa
decemlineata, CPB) larvae were then placed onto the diet and
mortality was scored after 4 days. Ten larvae per treatment were
used in all studies. Control mortality was 3.3%.

Activity against Japanese beetle grubs and beetles was
tested as follows. Turf grubs (Popillia japonica, 2-3rd instar)
were collected from infested lawns and maintained in the
laboratory in soil/peat mixture with carrot slices added as
additional diet. Turf beetles were pheromone-trapped locally and
maintained in the laboratory in plastic containers with maple
leaves as food. Following application of undiluted Photorhabdus
culture broth or control medium to corn rootworm artificial diet
(30 µl/1.54 cm², beetles) or carrot slices (larvae), both stages
were placed singly in a diet well and observed for any mortality
and feeding. In both cases there was a clear reduction in the
amount of feeding (and feces production) observed.

Activity against mosquito larvae was tested as follows. The
assay was conducted in a 96-well microtiter plate. Each well
contained 200 µl of aqueous solution (Photorhabdus culture broth,
control medium or H2O) and approximately 20, 1-day old larvae
(Aedes aegypti). There were 6 wells per treatment. The results
were read at 2 hours after infestation and did not change over
the three day observation period. No control mortality was seen.

Activity against fruitflies was tested as follows.
Purchased Drosophila melanogaster medium was prepared using 50%
dry medium and a 50% liquid of either water, control medium or
Photorhabdus culture broth. This was accomplished by placing
8.0 ml of dry medium in each of 3 rearing vials per treatment and
adding 8.0 ml of the appropriate liquid. Ten late instar
Drosophila melanogaster maggots were then added to each vial.
The vials were held on a laboratory bench, at room temperature,
under fluorescent ceiling lights. Pupal or adult counts were
made after 3, 7 and 10 days of exposure. Incorporation of
Photorhabdus culture broth into the diet media for fruitfly
through a hole in the shoe glass lid. All treatments contained
5% sucrose. Bioassays were held in the dark at room temperature
and graded at 19 days. Control mortality was 9%. Assays
delivering purified fractions utilized artificial ant diet mixed
with the treatment (purified fraction or control solution) at a
rate of 0.2 ml treatment/2.0 g diet in a plastic test tube. The
final protein concentration of the purified fraction was less
than 10 ng/g diet. Ten ants per treatment, a water source,
harborage and the treated diet were placed in sealed plastic
containers and maintained in the dark at 27°C in a humidified
incubator. Mortality was scored at day 10. No control mortality
was seen.

Activity against various lepidopteran larvae was tested as
follows. Photorhabdus culture broth or purified fractions were
applied directly to the surface (~1.5 cm²) of 0.25 ml of standard
artificial diet in 30 µl aliquots following dilution in control
medium or 10 mM sodium phosphate buffer, pH 7.0, respectively.
The diet plates were allowed to air-dry in a sterile flow-hood
and each well was infested with a single neonate larva. European
corn borer (Ostrinia nubilalis) and corn earworm (Helicoverpa
zea) eggs were supplied from commercial sources and hatched
in-house, whereas beet armyworm (Spodoptera exigua), cabbage
looper (Trichoplusia ni), tobacco budworm (Heliothis virescens),
codling moth (Laspeyresia pomonella) and black cutworm (Agrotis
ipsilon) larvae were supplied internally. Following infestation
with larvae, the diet plates were sealed, placed in a humidified
growth chamber and maintained in the dark at 27°C for the
appropriate period. Mortality and weight determinations were
scored at days 5-7 for Photorhabdus culture broth and days 4-7
for the purified fraction. Generally, 16 insects per treatment
were used in all studies. Control mortality ranged from 4-12.5%
for control medium and was less than 10% for phosphate buffer.

Table 3

Effect of Photorhabdus luminescens (strain W-14)
Culture Broth and Purified Toxin Fraction on Mortality and Growth
Inhibition of Different Insect Orders/Species

                                    Broth            Purified Fraction
Insect Order/Species            % Mort.  % G.I.      % Mort.  % G.I.

COLEOPTERA
Corn Rootworm
  Southern/neonate larva        100      na          100      na
  Southern/2nd instar           na       38.5        nt       nt
  Southern/adult                45       nt          nt       nt
  Western/2nd instar            na       35          nt       nt
Colorado Potato Beetle
  2nd instar                    93       nt          nt       nt
Turf Grub
  3rd instar                    na       a.f.        nt       nt
  adult                         na       a.f.        nt       nt

DIPTERA
Fruit Fly (adult emergence)     17       nt          nt       nt
Mosquito larvae                 100      na          nt       nt

HOMOPTERA
Aster Leafhopper                96.5     na          100      na

HYMENOPTERA
Argentine Ant                   75       na          nt       na
Carpenter Ant                   71       na          100      na

LEPIDOPTERA
Beet Armyworm                   12.5     36          18.75    41.4
Black Cutworm                   nt       nt          0        71.2
Cabbage Looper                  nt       nt          21.9     66.8
Codling Moth                    nt       nt          6.25     45.9
Corn Earworm                    56.3     94.2        97.9     na
European Corn Borer             96.7     98.4        100      na
Tobacco Budworm                 13.5     52.5        19.4     85.6

Mort. = mortality, G.I. = growth inhibition,
na = not applicable, nt = not tested, a.f. = anti-feedant

Example 3

Insecticide Utility Upon Soil Application 

Photorhabdus luminescens (strain W-14) culture broth was
shown to be active against corn rootworm when applied directly to
soil or a soil-mix (Metromix®). Activity against neonate SCR and
WCR in Metromix® was tested as follows (Table 4). The test was
run using corn seedlings (United Agriseeds brand CL614) that were
germinated in the light on moist filter paper for 6 days. After
the roots were approximately 3-6 cm long, a single kernel/seedling
was planted in a 591 ml clear plastic cup with 50 gm of dry
Metromix®. Twenty neonate SCR or WCR were then placed directly on
the roots of the seedling and covered with Metromix®. Upon
infestation, the seedlings were then drenched with 50 ml total
volume of a diluted broth solution. After drenching, the cups
were sealed and left at room temperature in the light for 7 days.
Afterwards, the seedlings were washed to remove all Metromix® and
the roots were excised and weighed. Activity was rated as the
percentage of corn root remaining relative to the control plants
and as leaf damage induced by feeding. Leaf damage was scored
visually and rated as either -, +, ++, or +++, with -
representing no damage and +++ representing severe damage.

Activity against neonate SCR in soil was tested as follows
(Table 5). The test was run using corn seedlings (United
Agriseeds brand CL614) that were germinated in the light on moist
filter paper for 6 days. After the roots were approximately 3-6
cm long, a single kernel/seedling was planted in a 591 ml clear
plastic cup with 150 gm of soil from a field in Lebanon, IN
planted the previous year with corn. This soil had not been
previously treated with insecticides. Twenty neonate SCR were
then placed directly on the roots of the seedling and covered
with soil. After infestation, the seedlings were drenched with
50 ml total volume of a diluted broth solution. After drenching,
the unsealed cups were incubated in a high relative humidity
chamber (80%) at 78°F. Afterwards, the seedlings were washed to
remove all soil and the roots were excised and weighed. Activity
was rated as the percentage of corn root remaining relative to
the control plants and as leaf damage induced by feeding. Leaf
damage was scored visually and rated as either -, +, ++, or +++,
with - representing no damage and +++ representing severe damage.

Table 4

Effect of Photorhabdus luminescens (strain W-14) Culture
Broth on Rootworm Larvae After Post-Infestation Drenching
(Metromix®)

Treatment            Larvae   Leaf Damage   Root Weight (g)    %

Southern Corn Rootworm
Water                -        -             0.4916 ± 0.023     100
Medium (2.0% v/v)    -        -             0.4416 ± 0.029     100
Broth (6.25% v/v)    -        -             0.4641 ± 0.081     100
Water                +                      0.1410 ± 0.006     28.7
Medium (2.0% v/v)    +        +++           0.1345 ± 0.028     30.4
Broth (1.56% v/v)    +        -             0.4830 ± 0.031     104

Western Corn Rootworm
Water                -        -             0.4446 ± 0.019     100
Broth (2.0% v/v)     -        -             0.4069 ± 0.026     100
Water                +        -             0.2202 ± 0.015     49
Broth (2.0% v/v)     +        -             0.3879 ± 0.013     95


Table 5

Effect of Photorhabdus luminescens (strain W-14) Culture Broth on
Southern Corn Rootworm Larvae After Post-Infestation Drenching
(Soil)

Treatment            Larvae   Leaf Damage   Root Weight (g)    %

Water                -        -             0.2148 ± 0.014     100
Broth (50% v/v)      -        -             0.2260 ± 0.016     103
Water                +                      0.0916 ± 0.009     43
Broth (50% v/v)      +        -             0.2428 ± 0.032     113
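
The "percentage of corn root remaining relative to the control plants" can be illustrated with a short calculation. The sketch below is not taken from the source: the function name `percent_of_control` is hypothetical, and it assumes that an infested treatment is simply divided by the chosen uninfested control weight. Under that assumption it reproduces, for example, the 28.7 and 104 entries of Table 4 and the 43 entry of Table 5, although not every tabulated value follows exactly from the printed weights.

```python
def percent_of_control(treated_root_g: float, control_root_g: float) -> float:
    """Root weight of a treatment expressed as a percentage of a control.

    Assumption (not stated in the source): the percentage is the plain
    ratio of mean root weights, treated / control * 100.
    """
    if control_root_g <= 0:
        raise ValueError("control root weight must be positive")
    return treated_root_g / control_root_g * 100.0
```

For instance, `percent_of_control(0.1410, 0.4916)` compares the infested water treatment with the uninfested water control for southern corn rootworm.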



Activity of Photorhabdus luminescens (strain W-14) culture
broth against second instar turf grubs in Metromix® was observed
in tests conducted as follows (Table 6). Approximately 50 gm of
dry Metromix® was added to a 591 ml clear plastic cup. The
Metromix® was then drenched with 50 ml total volume of a 50% (v/v)
diluted Photorhabdus broth solution. The dilution of crude broth
was made with water, with 50% broth being prepared by adding 25
ml of crude broth to 25 ml of water for 50 ml total volume. A 1%
(w/v) solution of proteose peptone #3 (PP3), which is a 50%
dilution of the normal media concentration, was used as a broth
control. After drenching, five second instar turf grubs were
placed on the top of the moistened Metromix®. Healthy turf grub
larvae burrowed rapidly into the Metromix®. Those larvae that did
not burrow within 1 h were removed and replaced with fresh larvae.
The cups were sealed and placed in a 28°C incubator, in the dark.
After seven days, larvae were removed from the Metromix® and
scored for mortality. Activity was rated as the percentage of
mortality relative to control.
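
The v/v dilution described above can be worked out with a small helper. This is only an illustrative sketch; the function name `vv_dilution` is hypothetical and does not come from the source. For a 50% (v/v) solution in 50 ml total volume it gives the 25 ml of crude broth plus 25 ml of water stated in the text.

```python
def vv_dilution(total_ml: float, percent_vv: float) -> tuple[float, float]:
    """Return (stock_ml, diluent_ml) for a percent (v/v) dilution."""
    if total_ml <= 0 or not 0 <= percent_vv <= 100:
        raise ValueError("need total_ml > 0 and 0 <= percent_vv <= 100")
    stock_ml = total_ml * percent_vv / 100.0
    # Whatever volume is not stock is made up with diluent (water here).
    return stock_ml, total_ml - stock_ml
```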



Table 6

Effect of Photorhabdus luminescens (strain W-14) Culture Broth on
Turf Grub After Pre-Infestation Drenching (Metromix®)

Treatment                     Mortality*    Mortality %

Water                         7/15          47
Control medium (1.0% w/v)     12/19         63
Broth (50% v/v)               17/20         85

*expressed as a ratio of dead/living larvae
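
The percentage column of Table 6 can be recomputed from the count column. The sketch below is an illustration only; `mortality_percent` is a hypothetical helper name. It assumes each entry is read as dead larvae over larvae scored, which is the reading that reproduces the tabulated 47, 63 and 85.

```python
def mortality_percent(count: str) -> int:
    """Convert a 'dead/scored' count such as '7/15' to a whole percent.

    Assumption: the second number is the total number of larvae scored.
    """
    dead_text, scored_text = count.split("/")
    dead, scored = int(dead_text), int(scored_text)
    if scored <= 0 or not 0 <= dead <= scored:
        raise ValueError("expected 0 <= dead <= scored and scored > 0")
    return round(100 * dead / scored)
```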

Example 4 

Insecticide Utility Upon Leaf Application 



Activity of Photorhabdus broth against European corn borer
was seen when the broth was applied directly to the surface of
maize leaves (Table 7). In these assays Photorhabdus broth was
diluted 100-fold with culture medium and applied manually to the
surface of excised maize leaves at a rate of ~6.0 µl/cm² of leaf
surface. The leaves were air dried and cut into equal sized
strips approximately 2 x 2 inches. The leaves were rolled,
secured with paper clips and placed in 1 oz plastic shot glasses
with 0.25 inch of 2% agar on the bottom surface to provide
moisture. Twelve neonate European corn borers were then placed
onto the rolled leaf and the cup was sealed. After incubation
for 5 days at 27°C in the dark, the samples were scored for
feeding damage and recovered larvae.

Table 7

Effect of Photorhabdus luminescens (strain W-14) Culture Broth on
European Corn Borer Larvae Following Pre-Infestation Application
to Excised Maize Leaves

Treatment            Leaf Damage    Larvae Recovered    Weight (mg)

Water                Extensive      55/120              0.42 mg
Control Medium       Extensive      40/120              0.50 mg
Broth (1.0% v/v)     Trace          3/120               0.15 mg

Activity of the culture broth against neonate tobacco
budworm (Heliothis virescens) was demonstrated using a leaf dip
methodology. Fresh cotton leaves were excised from the plant and
leaf disks were cut with an 18.5 mm cork-borer. The disks were
individually immersed in control medium (PP3) or Photorhabdus
luminescens (strain W-14) culture broth which had been
concentrated approximately 10-fold using an Amicon (Beverly, MA)
Proflux M12 tangential filtration system with a 10 kDa filter.
Excess liquid was removed and a straightened paper clip was
placed through the center of the disk. The paper clip was then
wedged into a plastic 1.0 oz shot glass containing approximately
2.0 ml of 1% agar. This served to suspend the leaf disk above
the agar. Following drying of the leaf disk, a single neonate
tobacco budworm larva was placed on the disk and the cup was
capped. The cups were then sealed in a plastic bag and placed in
a darkened, 27°C incubator for 5 days. At this time the
remaining larvae and leaf material were weighed to establish a
measure of leaf damage (Table 8).

Table 8

Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on
Tobacco Budworm Neonates in a Cotton-Leaf Dip Assay

                               Final Weights (mg)
Treatment                 Leaf Disk        Larvae

Control leaves            55.7 ± 1.3       na*
Control Medium            34.0 ± 2.9       4.3 ± 0.91
Photorhabdus broth        54.3 ± 1.4       0.0**

* - not applicable, ** - no live larvae found

Example 5, Part A
Characterization of Toxin Peptide Components 

In a subsequent analysis, the toxin protein subunits of the
bands isolated as in Example 1 were resolved on a 7% SDS
polyacrylamide electrophoresis gel with a ratio of 30:0.8
(acrylamide:BIS-acrylamide). This gel matrix facilitates better
resolution of the larger proteins. The gel system used to
estimate the Band 1 and Band 2 subunit molecular weights in
Example 1 was an 18% gel with a ratio of 38:0.18
(acrylamide:BIS-acrylamide), which allowed for a broader range of
size separation, but less resolution of higher molecular weight
components.

In this analysis, 10, rather than 8, protein bands were
resolved. Table 9 reports the calculated molecular weights of
the 10 resolved bands, and directly compares the molecular
weights estimated under these conditions to those of the prior
example. It is not surprising that additional bands were
detected under the different separation conditions used in this
example. Variations between the prior and new estimates of
molecular weight are also to be expected given the differences in
analytical conditions. In the analysis of this example, it is
thought that the higher molecular weight estimates are more
accurate than in Example 1, as a result of improved resolution.
However, these are estimates based on SDS PAGE analysis, which
is typically not analytically precise, and the apparent sizes of
the peptides may have been further altered by post- and
co-translational modifications.
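
Molecular weight estimates of this kind are commonly read off a calibration line fitted to marker proteins run on the same gel. The sketch below illustrates that general technique and is not the procedure used in the source: it assumes log10(molecular weight) is linear in relative mobility (Rf), and the function names and any marker values supplied to them are hypothetical.

```python
import math


def fit_log_mw_curve(markers):
    """Least-squares fit of log10(MW) = intercept + slope * Rf.

    `markers` is a list of (Rf, MW_kDa) pairs for standards of known size.
    """
    if len(markers) < 2:
        raise ValueError("at least two marker proteins are required")
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("markers must have distinct Rf values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_mw(rf, intercept, slope):
    """Apparent molecular weight (kDa) of a band with relative mobility rf."""
    return 10 ** (intercept + slope * rf)
```

Because the fit is made in log space, a small error in measured mobility translates into a proportionally larger error for the largest proteins, which is consistent with the caution expressed above about gel-based estimates.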

Amino acid sequences were determined for the N-terminal
portions of five of the 10 resolved peptides. Table 9 correlates
the molecular weight of the proteins and the identified
sequences. In SEQ ID NO:2, certain analyses suggest that the
proline at residue 5 may be an asparagine (asn). In SEQ ID NO:3,
certain analyses suggest that the amino acid residues at
positions 13 and 14 are both arginine (arg). In SEQ ID NO:4,
certain analyses suggest that the amino acid residue at position
6 may be either alanine (ala) or serine (ser). In SEQ ID NO:5,
certain analyses suggest that the amino acid residue at position
3 may be aspartic acid (asp).

Table 9

EXAMPLE 1
ESTIMATE      NEW ESTIMATE*      SEQ. LISTING

208           200.2 kDa          SEQ ID NO:1
184           175.0 kDa          SEQ ID NO:2
65.6          68.1 kDa           SEQ ID NO:3
60.8          65.1 kDa           SEQ ID NO:4
56.2          58.3 kDa           SEQ ID NO:5
25.1          23.2 kDa           SEQ ID NO:15

*New estimates are based on SDS PAGE and are not based on
gene sequences. SDS PAGE is not analytically precise.
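
By contrast with a gel estimate, a sequence-based size is simply the sum of the residue masses plus one water. The following sketch shows that arithmetic for a peptide written in three-letter code. It is an illustration rather than the source's method: the function name is hypothetical, the masses are standard average residue masses, and no modifications are considered. It is applied here to the 14-residue N-terminal sequence reported for SEQ ID NO:15 in Example 5, Part B.

```python
# Average residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Asn": 114.1038, "Asp": 115.0886, "Gln": 128.1307,
    "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926, "His": 137.1411,
    "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760, "Trp": 186.2132,
}
WATER_MASS = 18.0153  # one water is added for the free N- and C-termini


def peptide_mass(three_letter_sequence: str) -> float:
    """Average mass (Da) of an unmodified peptide given as 'Ala Gln Asp ...'."""
    residues = three_letter_sequence.split()
    if not residues:
        raise ValueError("empty sequence")
    return sum(RESIDUE_MASS[r] for r in residues) + WATER_MASS


# N-terminal sequence of SEQ ID NO:15 as given in the text.
SEQ15_N_TERMINUS = "Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr"
```

Calling `peptide_mass(SEQ15_N_TERMINUS)` gives the mass of only this N-terminal fragment, not of the full peptide.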



Example 5, Part B
Characterization of Toxin Peptide Components

New N-terminal sequence, SEQ ID NO:15, Ala Gln Asp Gly Asn
Gln Asp Thr Phe Phe Ser Gly Asn Thr, was obtained by further
N-terminal sequencing of peptides isolated from native
HPLC-purified toxin as described in Example 5, Part A, above. This
peptide comes from the tcaA gene. The peptide labeled TcaAii
starts at position 254 and goes to position 491, where the
TcaAiii peptide starts, SEQ ID NO:4. The estimated size of the
peptide based on the gene sequence is 25,240 Da.

Example 6

Characterization of Toxin Peptide Components

In yet another analysis, the toxin protein complex was
re-isolated from the Photorhabdus luminescens growth medium (after
culture without Tween) by performing a 10% - 80% ammonium sulfate
precipitation followed by an ion exchange chromatography step
(Mono Q) and two molecular sizing chromatography steps. These
conditions were like those used in Example 1. During the first
molecular sizing step, a second biologically active peak was
found at about 100 ± 10 kDa. Based upon protein measurements,
this fraction was 20 - 50 fold less active than the larger, or
primary, active peak of about 860 ± 100 kDa (native). During
this isolation experiment, a smaller active peak of about 325 ±
50 kDa that retained a considerable portion of the starting
biological activity was also resolved. It is thought that the
325 kDa peak is related to or derived from the 860 kDa peak.

A 56 kDa protein was resolved in this analysis. The
N-terminal sequence of this protein is presented in SEQ ID NO:6.
It is noteworthy that this protein shares significant identity
and conservation with SEQ ID NO:5 at the N-terminus, suggesting
that the two may be encoded by separate members of a gene family
and that the proteins produced by each gene are sufficiently
similar to both be operable in the insecticidal toxin complex.

A second, prominent 185 kDa protein was consistently present
in amounts comparable to that of protein 3 from Table 9, and may
be the same protein or protein fragment. The N-terminal sequence
of this 185 kDa protein is shown at SEQ ID NO:7.

Additional N-terminal amino acid sequence data were also
obtained from isolated proteins. None of the determined
N-terminal sequences appears identical to a protein identified in
Table 9. Other proteins were present in the isolated preparation.
One such protein has an estimated molecular weight of 108 kDa and
an N-terminal sequence as shown in SEQ ID NO:8. A second such
protein has an estimated molecular weight of 80 kDa and an
N-terminal sequence as shown in SEQ ID NO:9.

When the protein material in the approximately 325 kDa
active peak was analyzed by size, bands of approximately 51, 31,
28, and 22 kDa were observed. As in all cases in which a
molecular weight was determined by analysis of electrophoretic
mobility, these molecular weights were subject to error effects
introduced by buffer ionic strength differences, electrophoresis
power differences, and the like. One of ordinary skill would
understand that definitive molecular weight values cannot be
determined using these standard methods and that each was subject
to variation. It was hypothesized that proteins of these sizes
are degradation products of the larger protein species (of
approximately 200 kDa size) that were observed in the larger
primary toxin complex.

Finally, several preparations included a protein having the
N-terminal sequence shown in SEQ ID NO:10. This sequence was
strongly homologous to known chaperonin proteins, accessory
proteins known to function in the assembly of large protein
complexes. Although the applicants could not ascribe such an
assembly function to the protein identified in SEQ ID NO:10, it
was consistent with the existence of the described toxin protein
complex that such a chaperonin protein could be involved in its
assembly. Moreover, although such proteins have not directly
been suggested to have toxic activity, this protein may be
important to determining the overall structural nature of the
protein toxin, and thus, may contribute to the toxic activity or
durability of the complex in vivo after oral delivery.

Subsequent analysis of the stability of the protein toxin
complex to proteinase K was undertaken. It was determined that
after 24 hour incubation of the complex in the presence of a
10-fold molar excess of proteinase K, activity was virtually
eliminated (mortality on oral application dropped to about 5%).
These data confirm the proteinaceous nature of the toxin.

The toxic activity was also retained by a dialysis membrane, 
again confirming the large size of the native toxin complex. 

Example 7

Isolation, Characterization and Partial Amino Acid
Sequencing of Photorhabdus Toxins

Isolation and N-Terminal Amino Acid Sequencing: In a set of
experiments conducted in parallel to Examples 5 and 6, ammonium
sulfate precipitation of Photorhabdus proteins was performed by
adjusting Photorhabdus broth, typically 2-3 liters, to a final
concentration of either 10% or 20% by the slow addition of
ammonium sulfate crystals. After stirring for 1 hour at 4°C, the
material was centrifuged at 12,000 x g for 30 minutes. The
supernatant was adjusted to 80% ammonium sulfate, stirred at 4°C
for 1 hour, and centrifuged at 12,000 x g for 60 minutes. The
pellet was resuspended in one-tenth the volume of 10 mM Na2·PO4,
pH 7.0 and dialyzed against the same phosphate buffer overnight
at 4°C. The dialyzed material was centrifuged at 12,000 x g for
1 hour prior to ion exchange chromatography.

A HR 16/50 Q Sepharose (Pharmacia) anion exchange column was
equilibrated with 10 mM Na2·PO4, pH 7.0. Centrifuged, dialyzed
ammonium sulfate pellet was applied to the Q Sepharose column at
a rate of 1.5 ml/min and washed extensively at 3.0 ml/min with
equilibration buffer until the optical density (O.D. 280) reached
less than 0.100. Next, either a 60 minute NaCl gradient ranging
from 0 to 0.5 M at 3 ml/min, or a series of step elutions using
0.1 M, 0.4 M and finally 1.0 M NaCl for 60 minutes each, was
applied to the column. Fractions were pooled and concentrated
using a Centriprep 100. Alternatively, proteins could be eluted
by a single 0.4 M NaCl wash without prior elution with 0.1 M NaCl.

Two milliliter aliquots of concentrated Q Sepharose samples
were loaded at 0.5 ml/min onto a HR 16/50 Superose 12 (Pharmacia)
gel filtration column equilibrated with 10 mM Na2·PO4, pH 7.0.
The column was washed with the same buffer for 240 min at 0.5
ml/min and 2 min samples were collected. The void volume
material was collected and concentrated using a Centriprep 100.
Two milliliter aliquots of concentrated Superose 12 samples were
loaded at 0.5 ml/min onto a HR 16/50 Sepharose 4B-CL (Pharmacia)
gel filtration column equilibrated with 10 mM Na2·PO4, pH 7.0.
The column was washed with the same buffer for 240 min at 0.5
ml/min and 2 min samples were collected.
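
The collection scheme for each gel filtration run follows from the stated flow rate and timing. The sketch below is illustrative only; `fraction_plan` is a hypothetical helper, and it assumes a constant flow rate for the whole run. With 0.5 ml/min, a 240 min wash and 2 min samples, it yields 120 fractions of 1.0 ml each.

```python
def fraction_plan(flow_ml_per_min: float, run_min: float, fraction_min: float):
    """Return (number_of_fractions, fraction_volume_ml, total_volume_ml).

    Assumes the flow rate is constant and only complete fractions count.
    """
    if min(flow_ml_per_min, run_min, fraction_min) <= 0:
        raise ValueError("all arguments must be positive")
    n_fractions = int(run_min // fraction_min)
    fraction_volume_ml = flow_ml_per_min * fraction_min
    total_volume_ml = flow_ml_per_min * run_min
    return n_fractions, fraction_volume_ml, total_volume_ml
```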

The excluded protein peak was subjected to a second
fractionation by application to a gel filtration column that used
a Sepharose CL-4B resin, which separates proteins ranging from
~30 kDa to 1000 kDa. This fraction was resolved into two peaks:
a minor peak at the void volume (>1000 kDa) and a major peak
which eluted at an apparent molecular weight of about 860 kDa.
Over a one week period subsequent samples subjected to gel
filtration showed the gradual appearance of a third peak
(approximately 325 kDa) that seemed to arise from the major peak,
perhaps by limited proteolysis. Bioassays performed on the three
peaks showed that the void peak had no activity, while the 860
kDa toxin complex fraction was highly active, and the 325 kDa
peak was less active, although quite potent. SDS PAGE analysis
of Sepharose CL-4B toxin complex peaks from different
fermentation productions revealed two distinct peptide patterns,
denoted "P" and "S". The two patterns had marked differences in
the molecular weights and concentrations of peptide components in
their fractions. The "S" pattern, produced most frequently, had
4 high molecular weight peptides (> 150 kDa) while the "P"
pattern had 3 high molecular weight peptides. In addition, the
"S" peptide fraction was found to have 2-3 fold more activity
against European Corn Borer. This shift may be related to
variations in protein expression due to age of inoculum and/or
other factors based on growth parameters of aged cultures.

Milligram quantities of peak toxin complex fractions
determined to be "P" or "S" peptide patterns were subjected to
preparative SDS PAGE, and transblotted with TRIS-glycine
(Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems)
for 3-4 hours. Blots were sent for amino acid analysis and
N-terminal amino acid sequencing at Harvard MicroChem and
Cambridge ProChem, respectively. Three peptides in the "S"
pattern had unique N-terminal amino acid sequences compared to
the sequences identified in the previous example. A 201 kDa
(TcdAii) peptide set forth as SEQ ID NO:13 below shared 33% amino
acid identity and 50% similarity with SEQ ID NO:1 (TcbAii) (Table
10; in Table 10 vertical lines denote amino acid identities and
colons indicate conservative amino acid substitutions). A second
peptide of 197 kDa, SEQ ID NO:14 (TcdB), had 42% identity and 58%
similarity with SEQ ID NO:2 (TcaC). Yet a third peptide of 205
kDa was denoted TcdAii. In addition, a limited N-terminal amino
acid sequence, SEQ ID NO:16 (TcbA), of a peptide of at least 235
kDa was identical to the amino acid sequence, SEQ ID NO:12,
deduced from a cloned gene (tcbA), SEQ ID NO:11, containing a
deduced amino acid sequence corresponding to SEQ ID NO:1
(TcbAii). This indicates that the larger 235+ kDa peptide was
proteolytically processed to the 201 kDa peptide (TcbAii) (SEQ
ID NO:1) during fermentation, possibly resulting in activation of
the molecule. In yet another sequence, the sequence originally
reported as SEQ ID NO:5 (TcaBii) in Example 5 above was found to
contain an aspartic acid residue (Asp) at the third position
rather than glycine (Gly) and two additional amino acids, Gly and
Asp, at the eighth and ninth positions, respectively. In yet two
other sequences, SEQ ID NO:2 (TcaC) and SEQ ID NO:3 (TcaBi),
additional amino acid sequence was obtained. Densitometric
quantitation was performed using a sample that was identical to
the "S" preparation sent for N-terminal analysis. This analysis
showed that the 201 kDa and 197 kDa peptides represent 7.0% and
7.2%, respectively, of the total Coomassie brilliant blue stained
protein in the "S" pattern and are present in amounts similar to
the other abundant peptides. It is speculated that these
peptides may represent protein homologs, analogous to the
situation found with other bacterial toxins, such as various
CryI Bt toxins. These proteins vary from 40-90% homology at
their N-terminal amino acid sequence, which
encompasses the toxic fragment.

Internal Amino Acid Sequencing: To facilitate cloning of
toxin peptide genes, internal amino acid sequences of selected
peptides were obtained as follows. Milligram quantities of peak
2A fractions determined to be "P" or "S" peptide patterns were
subjected to preparative SDS PAGE, and transblotted with
TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied
Biosystems) for 3-4 hours. Blots were sent for amino acid
analysis and N-terminal amino acid sequencing at Harvard
MicroChem and Cambridge ProChem, respectively. Three peptides,
referred to as TcbAii (containing SEQ ID NO:1), TcdAii, and TcaBi
(containing SEQ ID NO:3), were subjected to trypsin digestion by
Harvard MicroChem followed by HPLC chromatography to separate
individual peptides. N-terminal amino acid analysis was
performed on selected tryptic peptide fragments. Two internal
peptides were sequenced for the peptide TcdAii (205 kDa peptide),
referred to as TcdAii-PT111 (SEQ ID NO:17) and TcdAii-PT79 (SEQ
ID NO:18). Two internal peptides were sequenced for the peptide
TcaBi (68 kDa peptide), referred to as TcaBi-PT158 (SEQ ID NO:19)
and TcaBi-PTIOS (SEQ ID NO:20). Four internal peptides were
sequenced for the peptide TcbAii (201 kDa peptide), referred to
as TcbAii-PT103 (SEQ ID NO:21), TcbAii-PTSS (SEQ ID NO:22),
TcbAii-PT81(a) (SEQ ID NO:23), and TcbAii-PT81(b) (SEQ ID NO:24).

Table 10
N-Terminal Amino Acid Sequences

201 kDa (33% identity & 50% similarity to SEQ ID NO:1)
L I G Y N N g F S G * A          SEQ ID NO:13
: | | | : |
P I Q G Y S D L F G N - A        SEQ ID NO:1

197 kDa (42% identity & 58% similarity to SEQ ID NO:2)
M Q N S Q T F S V G E L          SEQ ID NO:14
| | : | | : : |
M Q D S P E V S I T T L          SEQ ID NO:2
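
Percent identity and similarity figures like those in Table 10 can be recomputed from a pair of aligned sequences. The sketch below is an illustration, not the scoring scheme used in the source: the grouping of residues treated as conservative substitutions (`SIMILAR_GROUPS`) is an assumption, and the function name is hypothetical. Applied to the SEQ ID NO:14 and SEQ ID NO:2 sequences as printed, it gives 5 identities in 12 positions (about 42%) and, with these groups, 7 identical or conservative positions (about 58%).

```python
# Assumed conservative-substitution groups (not taken from the source).
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DENQ", "ST", "AG")


def identity_and_similarity(seq_a: str, seq_b: str):
    """Return (% identity, % similarity) for two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = 0
    conservative = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            identical += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            conservative += 1
    n = len(seq_a)
    return 100.0 * identical / n, 100.0 * (identical + conservative) / n


SEQ_ID_14 = "MQNSQTFSVGEL"
SEQ_ID_2 = "MQDSPEVSITTL"
```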




As a prerequisite for the production of Photorhabdus insect
toxic proteins in heterologous hosts, and for other uses, it is
necessary to isolate and characterize the genes that encode those
peptides. This objective was pursued in parallel. One approach,
described later, was based on the use of monoclonal and
polyclonal antibodies raised against the purified toxin which
were then used to isolate clones from an expression library. The
other approach, described in this example, is based on the use of
the N-terminal and internal amino acid sequence data to design
degenerate oligonucleotides for use in PCR amplification. Either
method can be used to identify DNA clones that contain the
peptide-encoding genes so as to permit the isolation of the
respective genes, and the determination of their DNA base
sequence.



GENOMIC DNA ISOLATION: Photorhabdus luminescens strain W-14
(ATCC accession number 55397) was grown on 2% proteose peptone #3
agar (Difco Laboratories, Detroit, MI) and insecticidal toxin
competence was maintained by repeated bioassay after passage,
using the method described in Example 1 above. A 50 ml shake
culture was produced in a 175 ml baffled flask in 2% proteose
peptone #3 medium, grown at 28°C and 150 rpm for approximately 24
hours. 15 ml of this culture was pelleted and frozen in its
medium at -20°C until it was thawed for DNA isolation. The
thawed culture was centrifuged (700 x g, 30 min) and the
floating orange mucopolysaccharide material was removed. The
remaining cell material was centrifuged (25,000 x g, 15 min) to
pellet the bacterial cells, and the medium was removed and
discarded.

Genomic DNA was isolated by an adaptation of the CTAB method
described in section 2.4.1 of Current Protocols in Molecular
Biology (Ausubel et al., eds., John Wiley & Sons, 1994) [modified
to include a salt shock and with all volumes increased 10-fold].
The pelleted bacterial cells were resuspended in TE buffer (10 mM
Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml, then 12
ml of 5 M NaCl was added; this mixture was centrifuged 20 min at
15,000 x g. The pellet was resuspended in 5.7 ml TE, and 300 µl
of 10% SDS and 60 µl of 20 mg/ml proteinase K (Gibco BRL
Products, Grand Island, NY; in sterile distilled water) were
added to the suspension. This mixture was incubated at 37°C for
1 hr; then approximately 10 mg lysozyme (Worthington Biochemical
Corp., Freehold, NJ) was added. After an additional 45 min, 1 ml
of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M
NaCl) were added. This preparation was incubated 10 min at 65°C,
then gently agitated and further incubated and agitated for
approximately 20 min to assist clearing of the cellular material.
An equal volume of chloroform/isoamyl alcohol solution (24:1,
v/v) was added, mixed gently and centrifuged. After two
extractions with an equal volume of PCI
(phenol/chloroform/isoamyl alcohol; 50:49:1, v/v/v; equilibrated
with 1 M Tris-HCl, pH 8.0; Intermountain Scientific Corporation,
Kaysville, UT), the DNA was precipitated with 0.6 volume of
isopropanol. The DNA precipitate was gently removed with a glass
rod, washed twice with 70% ethanol, dried, and dissolved in 2 ml
STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). This
preparation contained 2.5 mg/ml DNA, as determined by optical
density at 260 nm (i.e., OD260).
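
A DNA concentration is usually derived from an OD260 reading with a fixed conversion factor. The sketch below shows that arithmetic under the common assumption that an absorbance of 1.0 at 260 nm (1 cm path) corresponds to about 50 µg/ml of double-stranded DNA. The source does not state the factor, dilution or reading it used, so the function name and the example numbers are hypothetical.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # assumed conversion factor for dsDNA


def dsdna_mg_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration (mg/ml) from an A260 reading.

    `dilution_factor` is how many fold the sample was diluted before reading.
    """
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("a260 must be >= 0 and dilution_factor > 0")
    return a260 * dilution_factor * DSDNA_UG_PER_ML_PER_A260 / 1000.0
```

As a purely hypothetical example, a reading of 0.5 on a 100-fold dilution would correspond to 2.5 mg/ml.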

The molecular size range of the isolated genomic DNA was
evaluated for suitability for library construction. CHEF gel
analysis was performed in 1.5% agarose (SeaKem® LE, FMC
BioProducts, Rockland, ME) gels with 0.5 X TBE buffer (44.5 mM
Tris-HCl pH 8.0, 44.5 mM H3BO3, 1 mM EDTA) on a BioRad CHEF-DR II
apparatus with a Pulsewave 760 Switcher (Bio-Rad Laboratories,
Inc., Richmond, CA). The running parameters were: initial A
time, 3 sec; final A time, 12 sec; 200 volts; running
temperature, 4-18°C; run time, 16.5 hr. Ethidium bromide
staining and examination of the gel under ultraviolet light
indicated the DNA ranged from 30-250 kbp in size.

CONSTRUCTION OF LIBRARY: A partial Sau3A 1 digest was made
of this Photorhabdus genomic DNA preparation. The method was
based on section 3.1.3 of Ausubel (supra.). Adaptations included
running smaller scale reactions under various conditions until
nearly optimal results were achieved. Several scaled-up large
reactions with varied conditions were run, the results analyzed
on CHEF gels, and only the best large scale preparation was
carried forward. In the optimal case, 200 µg of Photorhabdus
genomic DNA was incubated with 1.5 units of Sau3A 1 (New England
Biolabs, "NEB", Beverly, MA) for 15 min at 37°C in 2 ml total
volume of 1X NEB 4 buffer (supplied as 10X by the manufacturer).
The reaction was stopped by adding 2 ml of PCI and centrifuging
at 8000 x g for 10 min. To the supernatant were added 200 µl of
5 M NaCl plus 6 ml of ice-cold ethanol. This preparation was
chilled for 30 min at -20°C, then centrifuged at 12,000 x g for
15 min. The supernatant was removed and the precipitate was
dried in a vacuum oven at 40°C, then resuspended in 400 µl of TE.
Spectrophotometric assay indicated about 40% recovery of the
input DNA. The digested DNA was size fractionated on a sucrose
gradient according to section 5.3.2 of CPMB (op. cit.). A 10%
to 40% (w/v) linear sucrose gradient was prepared with a gradient
maker in Ultra-Clear™ tubes (Beckman Instruments, Inc., Palo
Alto, CA) and the DNA sample was layered on top. After
centrifugation (26,000 rpm, 17 hr, Beckman SW41 rotor, 20°C),
fractions (about 750 µl) were drawn from the top of the gradient
and analyzed by CHEF gel electrophoresis (as described earlier).
Fractions containing Sau3A 1 fragments in the size range 20-40
kbp were selected and DNA was precipitated by a modification
(amounts of all solutions increased approximately 6.3-fold) of
the method in section 5.3.3 of Ausubel (supra.). After overnight
precipitation, the DNA was collected by centrifugation (17,000 x
g, 15 min), dried, redissolved in TE, pooled into a final volume
of 80 µl, and reprecipitated with the addition of 8 µl 3 M sodium
acetate and 220 µl ethanol. The pellet collected by
centrifugation as above was resuspended in 12 µl TE.
Concentration of the DNA was determined by Hoechst 33258 dye
(Polysciences, Inc., Warrington, PA) fluorometry in a Hoefer
TKO100 fluorimeter (Hoefer Scientific Instruments, San Francisco,
CA). Approximately 2.5 µg of the size-fractionated DNA was
recovered.

Thirty µg of cosmid pWE15 DNA (Stratagene, La Jolla, CA) was
digested to completion with 100 units of restriction enzyme BamH
I (NEB) in the manufacturer's buffer (final volume of 200 µl,
37°C, 1 hr). The reaction was extracted with 100 µl of PCI and
DNA was precipitated from the aqueous phase by addition of 20 µl
3 M sodium acetate and 550 µl -20°C absolute ethanol. After 10
min at -70°C, the DNA was collected by centrifugation (17,000 x
g, 15 min), dried under vacuum, and dissolved in 180 µl of 10 mM
Tris-HCl, pH 8.0. To this were added 20 µl of 10X CIP buffer
(100 mM Tris-HCl, pH 8.3; 10 mM ZnCl2; 10 mM MgCl2), and 1 µl
(0.25 units) of 1:4 diluted calf intestinal alkaline phosphatase
(Boehringer Mannheim Corporation, Indianapolis, IN). After 30
min at 37°C, the following additions were made: 2 µl 0.5 M EDTA,
pH 8.0; 10 µl 10% SDS; 0.5 µl of 20 mg/ml proteinase K (as
above), followed by incubation at 55°C for 30 min. Following
sequential extractions with 100 µl of PCI and 100 µl phenol
(Intermountain Scientific Corporation, equilibrated with 1 M
Tris-HCl, pH 8.0), the dephosphorylated DNA was precipitated by
addition of 72 µl of 7.5 M ammonium acetate and 550 µl -20°C
ethanol, incubation on ice for 30 min, and centrifugation as
above. The pelleted DNA was washed once with 500 µl -20°C 70%
ethanol, dried under vacuum, and dissolved in 20 µl of TE buffer.

Ligation of the size-fractionated Sau3A I fragments to the
BamH I-digested and phosphatased pWE15 vector was accomplished
using T4 ligase (NEB) by a modification (i.e., use of premixed
10X ligation buffer supplied by the manufacturer) of the protocol
in section 3.33 of Ausubel. Ligation was carried out overnight
in a total volume of 20 µl at 15°C, followed by storage at
-20°C.

Four µl of the cosmid DNA ligation reaction, containing
about 1 µg of DNA, was packaged into bacteriophage lambda using a
commercial packaging extract (Gigapack® III Gold Packaging
Extract, Stratagene), following the manufacturer's directions.
The packaged preparation was stored at 4°C until use. The
packaged cosmid preparation was used to infect Escherichia coli
XL1 Blue MR cells (Stratagene) according to the Gigapack® III Gold
protocols ("Titering the Cosmid Library"), as follows. XL1 Blue
MR cells were grown in LB medium (g/L: Bacto-tryptone, 10;
Bacto-yeast extract, 5; Bacto-agar, 15; NaCl, 5; [Difco
Laboratories, Detroit, MI]) containing 0.2% (w/v) maltose plus
10 mM MgSO4, at 37°C. After 5 hr growth, cells were pelleted at
700 x g (15 min) and resuspended in 6 ml of 10 mM MgSO4. The
culture density was adjusted with 10 mM MgSO4 to OD600 = 0.5.
The packaged cosmid library was diluted 1:10 or 1:20 with sterile
SM medium (0.1 M NaCl, 10 mM MgSO4, 50 mM Tris-HCl pH 7.5, 0.01%
w/v gelatin), and 25 µl of the diluted preparation was mixed with
25 µl of the diluted XL1 Blue MR cells. The mixture was incubated
at 25°C for 30 min (without shaking), then 200 µl of LB broth was
added, and incubation was continued for approximately 1 hr with
occasional gentle shaking. Aliquots (20-40 µl) of this culture
were spread on LB agar plates containing 100 mg/l ampicillin
(i.e., LB-Amp100) and incubated overnight at 37°C. To store the
library without amplification, single colonies were picked and
inoculated into individual wells of sterile 96-well microwell
plates, each well containing 75 µl of Terrific Broth (TB media:
12 g/l Bacto-tryptone, 24 g/l Bacto-yeast extract, 0.4% v/v
glycerol, 17 mM KH2PO4, 72 mM K2HPO4) plus 100 mg/l ampicillin
(i.e., TB-Amp100), and incubated (without shaking) overnight at
37°C. After replicating the 96-well plate into a copy plate,
75 µl/well of filter-sterilized TB:glycerol (1:1, v/v; with, or
without, 100 mg/l ampicillin) was added to the plate, it was
shaken briefly at 100 rpm, 37°C, and then closed with Parafilm®
(American National Can, Greenwich, CT) and placed in a -70°C
freezer for storage. Copy plates were grown and processed
identically to the master plates.
A total of 40 such master plates (and their copies) were
prepared.

SCREENING OF THE LIBRARY WITH RADIOLABELED DNA PROBES. To
prepare colony filters for probing with radioactively labeled
probes, ten 96-well plates of the library were thawed at 25°C
(bench top at room temperature). A replica plating tool with 96
prongs was used to inoculate a fresh 96-well copy plate
containing 75 µl/well of TB-Amp100. The copy plate was grown
overnight (stationary) at 37°C, then shaken about 30 min at 100
rpm at 37°C. A total of 800 colonies was represented in these
copy plates, due to nongrowth of some isolates. The replica tool
was used to inoculate duplicate impressions of the 96-well arrays
onto Magna NT (MSI, Westboro, MA) nylon membranes (0.45 micron,
220 x 250 mm) which had been placed on solid LB-Amp100 (100
ml/dish) in Bio-assay plastic dishes (Nunc, 243 x 243 x 18 mm;
Curtin Matheson Scientific, Inc., Wood Dale, IL). The colonies
were grown on the membranes at 37°C for about 3 hr.
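
The library dimensions described above (40 master plates of 96
wells, of which about 800 colonies were screened, with inserts
selected in the 20-40 kbp range) can be related to genome
coverage by the standard Clarke-Carbon estimate. The following is
a minimal sketch only. The genome size is a placeholder
assumption and is not stated in this specification, and the 30
kbp mean insert size is simply the midpoint of the selected
range.

```python
import math

# ASSUMPTION: placeholder genome size, not taken from the specification.
GENOME_BP = 5_700_000
# ASSUMPTION: midpoint of the 20-40 kbp size selection described in the text.
INSERT_BP = 30_000


def coverage_probability(n_clones, insert_bp, genome_bp):
    """Clarke-Carbon estimate: chance a given locus is in at least one clone."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


def clones_needed(probability, insert_bp, genome_bp):
    """Number of independent clones needed to reach the given probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# About 800 colonies grew in the ten plates that were screened.
print(round(coverage_probability(800, INSERT_BP, GENOME_BP), 3))
# 40 plates x 96 wells is an upper bound, since some wells showed no growth.
print(coverage_probability(40 * 96, INSERT_BP, GENOME_BP))
print(clones_needed(0.99, INSERT_BP, GENOME_BP))
```

Under these assumed values the 800 screened colonies already give
a high probability of containing any single locus, which is
consistent with several independent cosmids hybridizing to the
same probe.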

A positive control colony (a bacterial clone containing a
GZ4 sequence insert, see below) was grown on a separate Magna NT
membrane (Nunc, 0.45 micron, 82 mm circle) on LB medium
supplemented with 35 mg/l chloramphenicol (i.e., LB-Cam35), and
processed alongside the library colony membranes. Bacterial
colonies on the membranes were lysed, and the DNA was denatured
and neutralized according to a protocol taken from the Genius™
System User's Guide version 2.0 (Boehringer Mannheim,
Indianapolis, IN). Membranes were placed colony side up on
filter paper soaked with 0.5 N NaOH plus 1.5 M NaCl for 15 min to
denature, and neutralized on filter paper soaked with 1 M
Tris-HCl pH 8.0, 1.5 M NaCl for 15 min. After UV-crosslinking
using a Stratagene UV Stratalinker set on auto crosslink, the
membranes were stored dry at 25°C until use. Membranes were
trimmed into strips containing the duplicate impressions of a
single 96-well plate, then washed extensively by the method of
section 6.4.1 in CPMB (op. cit.): 3 hr at 25°C in 3X SSC, 0.1%
(w/v) SDS, followed by 1 hr at 65°C in the same solution, then
rinsed in 2X SSC in preparation for the hybridization step (20X
SSC = 3 M NaCl, 0.3 M sodium citrate, pH 7.0).
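
Because every wash and hybridization solution in this section is
specified as a multiple of SSC, the salt concentrations follow
directly from the 20X stock definition given above. The short
sketch below shows that conversion. The function names are
illustrative, and the 100 ml batch volume in the example is an
arbitrary choice rather than a value tied to this step.

```python
# Stock definition from the text: 20X SSC = 3 M NaCl, 0.3 M sodium citrate.
STOCK_FOLD = 20.0
STOCK_MOLAR = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(fold):
    """Molar concentration of each salt in an SSC solution of the given strength."""
    return {salt: molar * fold / STOCK_FOLD for salt, molar in STOCK_MOLAR.items()}


def stock_volume_ml(fold, final_volume_ml):
    """Volume of 20X stock needed to prepare final_volume_ml at the given strength."""
    return final_volume_ml * fold / STOCK_FOLD


for fold in (3, 2, 0.6, 0.25, 0.1):
    salts = ssc_molarity(fold)
    print(f"{fold}X SSC: {salts['NaCl']:.4f} M NaCl, "
          f"{salts['sodium citrate']:.5f} M sodium citrate")

# Illustrative batch: 100 ml of 3X SSC from the 20X stock.
print(stock_volume_ml(3, 100), "ml of 20X stock")
```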

Amplification of a specific genomic fragment of a tcaC gene.
Based on the N-terminal amino acid sequence determined for the
purified TcaC peptide fraction (disclosed herein as SEQ ID NO:2),
a pool of degenerate oligonucleotides (pool S4Psh) was
synthesized by standard β-cyanoethyl chemistry on an Applied
BioSystems ABI394 DNA/RNA Synthesizer (Perkin Elmer, Foster City,
CA). The oligonucleotides were deprotected 8 hours at 55°C,
dissolved in water, quantitated by spectrophotometric
measurement, and diluted for use. This pool corresponds to the
determined N-terminal amino acid sequence of the TcaC peptide.
The determined amino acid sequence and the corresponding
degenerate DNA sequence are given below, where A, C, G, and T are
the standard DNA bases, and I represents inosine:

Amino Acid      Met  Gln      Asp      Ser              Pro  Glu      Val
S4Psh       5'  ATG  CA(A/G)  GA(T/C)  (T/A)(C/G)(T/A)  CCI  GA(A/G)  GT   3'

Another set of degenerate oligonucleotides was synthesized
(pool P2.3.5R), representing the complement of the coding strand
for the determined amino acid sequence of SEQ ID NO:17:

Amino Acid      Ala          Phe      Asn      Ile        Asp      Asp      Val
Codons      5'  GCN          TT(T/C)  AA(T/C)  AT(A/T/C)  GA(T/C)  GA(T/C)  GT   3'
P2.3.5R     3'  CG(A/C/G/T)  AA(A/G)  TT(A/G)  TA(T/A/G)  CT(A/G)  CT(A/G)  CA   5'
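
The size of each degenerate pool follows from multiplying the
number of alternatives at every mixed position, with inosine
counted as a single base. The sketch below parses the bracketed
pool notation used above and reports the pool length and
degeneracy. The parser and its function names are illustrative
assumptions and are not part of the original synthesis procedure.

```python
import re
from math import prod

# Pool sequences as written above (P2.3.5R is given 3' to 5').
S4PSH = "ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT"
P235R_3TO5 = "CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA"


def positions(pool):
    """Return one set of allowed bases per nucleotide position of the pool."""
    tokens = re.findall(r"\(([^)]*)\)|([ACGTI])", pool.replace(" ", ""))
    return [set(alts.split("/")) if alts else {base} for alts, base in tokens]


def degeneracy(pool):
    """Number of distinct sequences in the pool (inosine counts as one base)."""
    return prod(len(allowed) for allowed in positions(pool))


for name, pool in (("S4Psh", S4PSH), ("P2.3.5R", P235R_3TO5)):
    print(name, "length", len(positions(pool)), "degeneracy", degeneracy(pool))
```

Both pools come out 20 bases long, with 64 and 192 members
respectively, which matches the pool sizes quoted in the sequence
analysis later in this section.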

These oligonucleotides were used as primers in Polymerase
Chain Reactions (PCR®, Roche Molecular Systems, Branchburg, NJ) to
amplify a specific DNA fragment from genomic DNA prepared from
Photorhabdus strain W-14 (see above). A typical reaction (50 µl)
contained 125 pmol of each primer pool P2Psh and P2.3.5R, 253 ng
of genomic template DNA, 10 nmol each of dATP, dCTP, dGTP, and
dTTP, 1X GeneAmp® PCR buffer, and 2.5 units of AmpliTaq® DNA
polymerase (both from Roche Molecular Systems; 10X GeneAmp® buffer
is 100 mM Tris-HCl pH 8.3, 500 mM KCl, 0.01% w/v gelatin).
Amplifications were performed in a Perkin Elmer Cetus DNA Thermal
Cycler (Perkin Elmer, Foster City, CA) using 35 cycles of 94°C
(1.0 min), 55°C (2.0 min), 72°C (3.0 min), followed by an
extension period of 7.0 min at 72°C. Amplification products were
analyzed by electrophoresis through 2% w/v NuSieve® 3:1 agarose
(FMC BioProducts) in TEA buffer (40 mM Tris-acetate, 2 mM EDTA,
pH 8.0). A specific product of estimated size 250 bp was
observed amongst numerous other amplification products by
ethidium bromide (0.5 µg/ml) staining of the gel and examination
under ultraviolet light.
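
The reaction composition above is stated in absolute amounts, so
it can help to restate it as final concentrations and to total the
programmed cycling time. The sketch below does this, reading the
nucleotide amount as 10 nmol of each dNTP per 50 µl reaction. The
helper names are illustrative, and ramp times between temperatures
are ignored.

```python
def micromolar(amount_nmol, volume_ul):
    """Final concentration in µM for an amount in nmol dissolved in volume_ul."""
    # nmol per µl equals mmol per litre (mM); multiply by 1000 for µM.
    return amount_nmol / volume_ul * 1000.0


def program_minutes(cycles, step_minutes, final_extension_min):
    """Total programmed hold time, excluding temperature ramps."""
    return cycles * sum(step_minutes) + final_extension_min


REACTION_UL = 50.0
print("each dNTP:", micromolar(10.0, REACTION_UL), "µM")
print("each primer pool:", micromolar(0.125, REACTION_UL), "µM")  # 125 pmol
print("cycling time:", program_minutes(35, (1.0, 2.0, 3.0), 7.0), "min")
```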

The region of the gel containing an approximately 250 bp
product was excised, and a small plug (0.5 mm dia.) was removed
and used to supply template for PCR amplification (40 cycles).
The reaction (50 µl) contained the same components as above,
minus genomic template DNA. Following amplification, the ends of
the fragments were made blunt and were phosphorylated by
incubation at 25°C for 20 min with 1 unit of T4 DNA polymerase
(NEB), 1 nmol ATP, and 2.15 units of T4 kinase (Pharmacia Biotech
Inc., Piscataway, NJ).

DNA fragments were separated from residual primers by
electrophoresis through 1% w/v GTG® agarose (FMC) in TEA. A gel
slice containing fragments of apparent size 250 bp was excised,
and the DNA was extracted using a Qiaex kit (Qiagen Inc.,
Chatsworth, CA).

The extracted DNA fragments were ligated to plasmid vector
pBC KS(+) (Stratagene) that had been digested to completion with
restriction enzyme Sma I and extracted in a manner similar to
that described for pWE15 DNA above. A typical ligation reaction
(16.3 µl) contained 100 ng of digested pBC KS(+) DNA, 70 ng of
250 bp fragment DNA, 1 nmol [Co(NH3)6]Cl3, and 3.9 Weiss units of
T4 DNA ligase (Collaborative Biomedical Products, Bedford, MA),
in 1X ligation buffer (50 mM Tris-HCl, pH 7.1; 10 mM MgCl2; 10 mM
dithiothreitol; 1 mM spermidine, 1 mM ATP, 100 mg/ml bovine serum
albumin). Following overnight incubation at 14°C, the ligated
products were transformed into frozen, competent Escherichia coli
DH5α cells (Gibco BRL) according to the suppliers'
recommendations, and plated on LB-Cam35 plates, containing IPTG
(119 µg/ml) and X-gal (50 µg/ml). Independent white colonies
were picked, and plasmid DNA was prepared by a modified
alkaline-lysis/PEG precipitation method (PRISM™ Ready Reaction
DyeDeoxy™ Terminator Cycle Sequencing Kit Protocols; ABI/Perkin
Elmer). The nucleotide sequence of both strands of the insert DNA
was determined, using T7 primers [pBC KS(+) bases 601-623:
TAAAACGACGGCCAGTGAGCGCG] and LacZ primers [pBC KS(+) bases
792-816: ATGACCATGATTACGCCAAGCGCGC] and protocols supplied with
the PRISM™ sequencing kit (ABI/Perkin Elmer). Nonincorporated
dye-terminator dideoxyribonucleotides were removed by passage
through Centri-Sep 100 columns (Princeton Separations, Inc.,
Adelphia, NJ) according to the manufacturer's instructions. The
DNA sequence was obtained by analysis of the samples on an ABI
Model 373A DNA Sequencer (ABI/Perkin Elmer). The DNA sequences
of two isolates, GZ4 and HB14, were found to be as illustrated in
Figure 1.

This sequence illustrates the following features: i) bases
1-20 represent one of the 64 possible sequences of the S4Psh
degenerate oligonucleotides, ii) the sequence of amino acids 1-3
and 6-12 corresponds exactly to that determined for the
N-terminus of TcaC (disclosed as SEQ ID NO:2), iii) the fourth
amino acid encoded is a cysteine residue rather than serine. This
difference is encoded within the degeneracy for the serine codons
(see above), iv) the fifth amino acid encoded is proline,
corresponding to the TcaC N-terminal sequence given as SEQ ID
NO:2, v) bases 257-276 encode one of the 192 possible sequences
designed into the degenerate pool, vi) the TGA termination codon
introduced at bases 268-270 is the result of complementarity to
the degeneracy built into the oligonucleotide pool at the
corresponding position, and does not indicate a shortened reading
frame for the corresponding gene.

Labeling of a TcaC peptide gene-specific probe. DNA
fragments corresponding to the above 276 bases were amplified (35
cycles) by PCR® in a 100 µl reaction volume, using 100 pmol each
of P2Psh and P2.3.5R primers, 10 ng of plasmids GZ4 or HB14 as
templates, 20 nmol each of dATP, dCTP, dGTP, and dTTP, 5 units of
AmpliTaq® DNA polymerase, and 1X concentration of GeneAmp® buffer,
under the same temperature regimes as described above. The
amplification products were extracted from a 1% GTG® agarose gel
by Qiaex kit and quantitated by fluorometry.

The extracted amplification products from plasmid HB14
template (approximately 400 ng) were split into five aliquots and
labeled with 32P-dCTP using the High Prime Labeling Mix
(Boehringer Mannheim) according to the manufacturer's
instructions. Nonincorporated radioisotope was removed by
passage through NucTrap® Probe Purification Columns (Stratagene),
according to the supplier's instructions. The specific activity
of the labeled DNA product was determined by scintillation
counting to be 3.11 x 10 a dpm/µg. This labeled DNA was used to
probe membranes prepared from 800 members of the genomic library.



Screening with a TcaC-peptide gene specific probe. The
radiolabeled HB14 probe was boiled approximately 10 min, then
added to "minimal hyb" solution. (Note: The "minimal hyb" method
is taken from a CERES protocol; "Restriction Fragment Length
Polymorphism Laboratory Manual version 4.0", sections 4-40 and
4-47; CERES/NPI, Salt Lake City, UT. NPI is now defunct, with its
successors operating as Linkage Genetics.) "Minimal hyb"
solution contains 10% w/v PEG (polyethylene glycol, M.W. approx.
8000), 7% w/v SDS, 0.6X SSC, 10 mM sodium phosphate buffer (from
a 1 M stock containing 95 g/l NaH2PO4·1H2O and 84.5 g/l
Na2HPO4·7H2O), 5 mM EDTA, and 100 mg/ml denatured salmon sperm
DNA. Membranes were blotted dry briefly then, without
prehybridization, 5 strips of membrane were placed in each of 2
plastic boxes containing 75 ml of "minimal hyb" and 2.6 ng/ml of
radiolabeled HB14 probe. These were incubated overnight with
slow shaking (50 rpm) at 60°C. The filters were washed three
times for approximately 10 min each at 25°C in "minimal hyb wash
solution" (0.25X SSC, 0.2% SDS), followed by two 30-min washes
with slow shaking at 60°C in the same solution. The filters were
placed on paper covered with Saran Wrap® (Dow Brands,
Indianapolis, IN) in a light-tight autoradiographic cassette and
exposed to X-Omat X-ray film (Kodak, Rochester, NY) with two
DuPont Cronex Lightning-Plus C1 enhancers (Sigma Chemical Co.,
St. Louis, MO), for 4 hr at -70°C. Upon development (standard
photographic procedures), significant signals were evident in
both replicates amongst a high background of weaker, more
irregular signals. The filters were again washed for about 4 hr
at 68°C in "minimal hyb wash solution" and then placed again in
the cassettes and film was exposed overnight at -70°C. Twelve
possible positives were identified due to strong signals on both
of the duplicate 96-well colony impressions. No signal was seen
with negative control membranes (colonies of XL1 Blue MR cells
containing pWE15), and a very strong signal was seen with
positive control membranes (DH5α cells containing the GZ4 isolate
of the PCR product) that had been processed concurrently with the
experimental samples.
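
As a consistency check on the hybridization setup described above,
the total probe mass implied by the stated box volume and probe
concentration can be compared with the approximately 400 ng of
amplification product that was labeled. This is a simple
arithmetic sketch, and the function name is illustrative.

```python
import math


def probe_mass_ng(boxes, volume_ml_per_box, conc_ng_per_ml):
    """Total probe mass across all hybridization boxes, in ng."""
    return boxes * volume_ml_per_box * conc_ng_per_ml


# Values from the text: 2 boxes, 75 ml of "minimal hyb" each, 2.6 ng/ml probe.
used_ng = probe_mass_ng(2, 75, 2.6)
labeled_ng = 400.0  # approximate mass of HB14 product that was labeled

print(f"probe used: {used_ng:.0f} ng of about {labeled_ng:.0f} ng labeled")
print("within 5% of labeled mass:",
      math.isclose(used_ng, labeled_ng, rel_tol=0.05))
```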

The twelve putative hybridization-positive colonies were
retrieved from the frozen 96-well library plates and grown
overnight at 37°C on solid LB-Amp100 medium. They were then
patched (3/plate, plus three negative controls: XL1 Blue MR cells
containing the pWE15 vector) onto solid LB-Amp100. Two sets of
membranes (Magna NT nylon, 0.45 micron) were prepared for
hybridization. The first set was prepared by placing a filter
directly onto the colonies on a patch plate, then removing it
with adherent bacterial cells, and processing as below. Filters
of the second set were placed on plates containing LB-Amp100
medium, then inoculated by transferring cells from the patch
plates onto the filters. After overnight growth at 37°C, the
filters were removed from the plates and processed.

Bacterial cells on the filters were lysed and DNA denatured
by placing each filter colony-side-up on a pool (1.0 ml) of 0.5 N
NaOH in a plastic plate for 3 min. The filters were blotted dry
on a paper towel, then the process was repeated with fresh 0.5 M
NaOH. After blotting dry, the filters were neutralized by
placing each on a 1.0 ml pool of 1 M Tris-HCl, pH 7.5 for 3 min,
blotted dry, and reneutralized with fresh buffer. This was
followed by two similar soakings (5 min each) on pools of 0.5 M
Tris-HCl pH 7.5 plus 1.5 M NaCl. After blotting dry, the DNA was
UV crosslinked to the filter (as above), and the filters were
washed (25°C, 100 rpm) in about 100 ml of 3X SSC plus 0.1% (w/v)
SDS (4 times, 30 min each with fresh solution for each wash).
They were then placed in a minimal volume of prehybridization
solution [5X SSC plus 1% w/v each of Ficoll 400 (Pharmacia),
polyvinylpyrrolidone (av. M.W. 360,000; Sigma) and bovine serum
albumin Fraction V (Sigma)] for 2 hr at 65°C, 50 rpm. The
prehybridization solution was removed, and replaced with the HB14
32P-labeled probe that had been saved from the previous
hybridization of the library membranes and which had been
denatured at 95°C for 5 min. Hybridization was performed at 60°C
for 16 hr with shaking at 50 rpm.

Following removal of the labeled probe solution, the
membranes were washed 3 times at 25°C (50 rpm, 15 min) in 3X SSC
(about 150 ml each wash). They were then washed for 3 hr at 68°C
(50 rpm) in 0.25X SSC plus 0.2% SDS (minimal hyb wash solution),
and exposed to X-ray film as described above for 1.5 hr at 25°C
(no enhancer screens). This exposure revealed very strong
hybridization signals to cosmid isolates 22G12, 25A10, 26A5, and
26B10, and a very weak signal with cosmid isolate 8B10. No
signal was seen with the negative control (pWE15) colonies, and a
very strong signal was seen with positive control membranes (DH5α
cells containing the GZ4 isolate of the PCR product) that had
been processed concurrently with the experimental samples.
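
The cosmid isolate names above appear to combine a library plate
number with a 96-well position (row letter A-H, column 1-12).
That reading is an assumption, since the naming convention is not
spelled out in this section. Under that assumption, a small
parser such as the following sketch can map each isolate back to
its plate and well when retrieving clones from the frozen library.

```python
import re
from collections import defaultdict

# ASSUMPTION: names are <plate number><row A-H><column 1-12>, e.g. "22G12".
ISOLATE_RE = re.compile(r"^(\d+)([A-H])(\d{1,2})$")


def parse_isolate(name):
    """Split an isolate name into (plate, row, column) under the assumed scheme."""
    match = ISOLATE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized isolate name: {name!r}")
    plate, row, column = int(match.group(1)), match.group(2), int(match.group(3))
    if column not in range(1, 13):
        raise ValueError(f"column out of range for a 96-well plate: {name!r}")
    return plate, row, column


def wells_by_plate(names):
    """Group well positions by plate number."""
    grouped = defaultdict(list)
    for name in names:
        plate, row, column = parse_isolate(name)
        grouped[plate].append(f"{row}{column}")
    return dict(grouped)


print(wells_by_plate(["22G12", "25A10", "26A5", "26B10", "8B10"]))
```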



Amplification of a specific genomic fragment of a tcaB gene.
Based on the N-terminal amino acid sequence determined for the
purified TcaBi peptide fraction (disclosed here as SEQ ID NO:3) a
pool of degenerate oligonucleotides (pool P8F) was synthesized as
described for peptide TcaC. The determined amino acid sequence
and the corresponding degenerate DNA sequence are given below,
where A, C, G, and T are the standard DNA bases, and I represents
inosine:

Amino Acid      Leu      Phe  Thr  Gln      Thr  Leu      Lys  Glu  Ala  Arg
P8F         5'  (C/T)TI  TTT  ACI  CA(A/G)  ACI  (C/T)TI  AAA  GAA  GCI  (A/C)G  3'

Another set of degenerate oligonucleotides was synthesized
(pool P8.108.3R), representing the complement of the coding
strand for the determined amino acid sequence of the TcaBi-PT108
internal peptide (disclosed herein as SEQ ID NO:20):

Amino Acid      Met  Tyr      Tyr      Ile        Gln      Ala          Gln      Gln
Codons          ATG  TA(T/C)  TA(T/C)  AT(T/C/A)  CA(A/G)  GC(A/C/G/T)  CA(A/G)  CA(A/G)
P8.108.3R   3'  TAC  AT(A/G)  AT(A/G)  TA(A/G/T)  GT(T/C)  CGI          GT(T/C)  GT       5'

These oligonucleotides were used as primers for PCR® using
HotStart 50 Tubes™ (Molecular Bio-Products, Inc., San Diego, CA)
to amplify a specific DNA fragment from genomic DNA prepared from
Photorhabdus strain W-14 (see above). A typical reaction (50 µl)
contained (bottom layer) 25 pmol of each primer pool P8F and
P8.108.3R, with 2 nmol each of dATP, dCTP, dGTP, and dTTP, in 1X
GeneAmp® PCR buffer, and (top layer) 230 ng of genomic template
DNA, 8 nmol each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of
AmpliTaq® DNA polymerase, in 1X GeneAmp® PCR buffer.
Amplifications were performed by 35 cycles as described for the
TcaC peptide. Amplification products were analyzed by
electrophoresis through 0.7% w/v SeaKem® LE agarose (FMC) in TEA
buffer. A specific product of estimated size 1600 bp was
observed.

Four such reactions were pooled, and the amplified DNA was
extracted from a 1.0% SeaKem® LE gel by Qiaex kit as described for
the TcaC peptide. The extracted DNA was used directly as the
template for sequence determination (PRISM™ Sequencing Kit) using
the P8F and P8.108.3R primer pools. Each reaction contained
about 100 ng template DNA and 25 pmol of one primer pool, and was
processed according to standard protocols as described for the
TcaC peptide. An analysis of the sequence derived from extension
of the P8F primers revealed the short DNA sequence (and encoded
amino acid sequence):

GAT GCA TTG NTT   GCT
Asp Ala Leu (Val) Ala

which corresponds to a portion of the N-terminal peptide sequence
disclosed as SEQ ID NO:3 (TcaBi).

Labeling of a TcaBi-peptide gene-specific probe.
Approximately 50 ng of gel-purified TcaBi DNA fragment was
labeled with 32P-dCTP as described above, and nonincorporated
radioisotopes were removed by passage through a NICK Column®
(Pharmacia). The specific activity of the labeled DNA was
determined to be 6 x 10° dpm/µg. This labeled DNA was used to
probe colony membranes prepared from members of the genomic
library that had hybridized to the TcaC-peptide specific probe.

The membranes containing the 12 colonies identified in the
TcaC-probe library screen (see above) were stripped of
radioactive TcaC-specific label by boiling twice for
approximately 30 min each time in 1 liter of 0.1X SSC plus 0.1%
SDS. Removal of radiolabel was checked with a 6 hr film
exposure. The stripped membranes were then incubated with the
TcaBi peptide-specific probe prepared above. The labeled DNA was
denatured by boiling for 10 min, and then added to the filters
that had been incubated for 1 hr in 100 ml of "minimal hyb"
solution at 60°C. After overnight hybridization at this
temperature, the probe solution was removed, and the filters were
washed as follows (all in 0.3X SSC plus 0.1% SDS): once for 5 min
at 25°C, once for 1 hr at 60°C in fresh solution, and once for 1
hr at 63°C in fresh solution. After 1.5 hr exposure to X-ray
film by standard procedures, 4 strongly-hybridizing colonies were
observed. These were, as with the TcaC-specific probe, isolates
22G12, 25A10, 26A5, and 26B10.

The same TcaBi probe solution was diluted with an equal
volume (about 100 ml) of "minimal hyb" solution, and then used to
screen the membranes containing the 800 members of the genomic
library. After hybridization, washing, and exposure to X-ray
film as described above, only the four cosmid clones 22G12,
25A10, 26A5, and 26B10 were found to hybridize strongly to this
probe.

ISOLATION OF SUBCLONES CONTAINING GENES ENCODING TcaC AND
TcaBi PEPTIDES, AND DETERMINATION OF DNA BASE SEQUENCE THEREOF:
Three hybridization-positive cosmids in strain XL1 Blue MR were
grown with shaking overnight (200 rpm) at 30°C in 100 ml
TB-Amp100. After harvesting the cells by centrifugation, cosmid
DNA was prepared using a commercially available kit (BIGprep™, 5
Prime 3 Prime, Inc., Boulder, CO), following the manufacturer's
protocols. Only one cosmid, 26A5, was successfully isolated by
this procedure. When digested with restriction enzyme EcoR I
(NEB) and analyzed by gel electrophoresis, fragments of
approximate sizes 14, 10, 8 (vector), 5, 3.3, 2.9, and 1.5 kbp
were detected. A second attempt to isolate cosmid DNA from the
same three strains (8 ml cultures; TB-Amp100, 30°C) utilized a
boiling miniprep method (Evans G. and G. Wahl, 1987, "Cosmid
vectors for genomic walking and rapid restriction mapping," in
Guide to Molecular Cloning Techniques, Meth. Enzymology, vol.
152, S. Berger and A. Kimmel, eds., pgs. 604-610). Only one
cosmid, 25A10, was successfully isolated by this method. When
digested with restriction enzyme EcoR I (NEB) and analyzed by gel
electrophoresis, this cosmid showed a fragmentation pattern
identical to that previously seen with cosmid 26A5.
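
The EcoR I fragment sizes reported above can be checked against
the 20-40 kbp size selection used when the library was built, by
summing the non-vector fragments. The sketch below does this
using the approximate gel sizes from the text. The band marked as
vector is left out of the sum, and because gel estimates are
approximate the total is only indicative.

```python
# Approximate non-vector EcoR I fragment sizes (kbp) reported for cosmid 26A5.
INSERT_FRAGMENTS_KBP = [14.0, 10.0, 5.0, 3.3, 2.9, 1.5]

# Size range selected on the sucrose gradient when the library was constructed.
SELECTED_RANGE_KBP = (20.0, 40.0)


def insert_size_kbp(fragments):
    """Estimated insert size as the sum of non-vector fragment sizes."""
    return round(sum(fragments), 1)


def within_selection(size_kbp, size_range=SELECTED_RANGE_KBP):
    """True if the estimated insert size falls inside the selected range."""
    low, high = size_range
    return size_kbp >= low and high >= size_kbp


total = insert_size_kbp(INSERT_FRAGMENTS_KBP)
print(f"estimated insert: {total} kbp; "
      f"within 20-40 kbp: {within_selection(total)}")
```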

A 0.15 ng sample of 26A5 cosmid DNA was used to transform 50
µl of E. coli DH5α cells (Gibco BRL), by the supplier's
protocols. A single colony isolate of that strain was inoculated
into 4 ml of TB-Amp100, and grown for 8 hr at 37°C.
Chloramphenicol was added to a final concentration of 225 µg/ml,
incubation was continued for another 24 hr, then cells were
harvested by centrifugation and frozen at -20°C. Isolation of
the 26A5 cosmid DNA was by a standard alkaline lysis miniprep
(Maniatis et al., op. cit., p. 382), modified by increasing all
volumes by 50% and with stirring or gentle mixing, rather than
vortexing, at every step. After washing the DNA pellet in 70%
ethanol, it was dissolved in TE containing 25 µg/ml ribonuclease
A (Boehringer Mannheim).



Identification of EcoR I fragments hybridizing to
GZ4-derived and TcaBi-probes. Approximately 0.4 µg of cosmid
25A10 (from XL1 Blue MR cells) and about 0.5 µg of cosmid 26A5
(from chloramphenicol-amplified DH5α cells) were each digested
with about 15 units of EcoR I (NEB) for 85 min, frozen overnight,
then heated at 65°C for five min, and electrophoresed in a 0.7%
agarose gel (SeaKem® LE, 1X TEA, 80 volts, 90 min). The DNA was
stained with ethidium bromide as described above, and
photographed under ultraviolet light. The EcoR I digest of
cosmid 25A10 was a complete digestion, but the sample of cosmid
26A5 was only partially digested under these conditions. The
agarose gel containing the DNA fragments was subjected to
depurination, denaturation and neutralization, followed by
Southern blotting onto a Magna NT nylon membrane, using a high
salt (20X SSC) protocol, all as described in section 2.9 of
Ausubel et al. (CPMB, op. cit.). The transferred DNA was then
UV-crosslinked to the nylon membrane as before.

A TcaC-peptide specific DNA fragment corresponding to the
insert of plasmid isolate GZ4 was amplified by PCR® in a 100 µl
reaction volume as described previously above. The amplification
products from three such reactions were pooled and were extracted
from a 1% GTG® agarose gel by Qiaex kit, as described above, and
quantitated by fluorometry. The gel-purified DNA (100 ng) was
labeled with 32P-dCTP using the High Prime Labeling Mix
(Boehringer Mannheim) as described above, to a specific activity
of 6.34 x 10' dpm/µg.

The 32P-labeled GZ4 probe was boiled 10 min, then added to
"minimal hyb" buffer (at 1 ng/ml), and the Southern blot membrane
containing the digested cosmid DNA fragments was added, and
incubated for 4 hr at 60°C with gentle shaking at 50 rpm. The
membrane was then washed 3 times at 25°C for about 5 min each
(minimal hyb wash solution), followed by two washes for 30 min
each at 60°C. The blot was exposed to film (with enhancer
screens) for about 30 min at -70°C. The GZ4 probe hybridized
strongly to the 5.0 kbp (apparent size) EcoR I fragment of both
these two cosmids, 26A5 and 25A10.

The membrane was stripped of radioactivity by boiling for
about 30 min in 0.1X SSC plus 0.1% SDS, and absence of
radiolabel was checked by exposure to film. It was then
hybridized at 60°C for 3.5 hours with the (denatured) TcaBi probe
in "minimal hyb" buffer previously used for screening the colony
membranes (above), washed as described previously, and exposed to
film for 40 min at -70°C with two enhancer screens. With both
cosmids, the TcaBi probe hybridized lightly with the
approximately 5.0 kbp EcoR I fragment, and strongly with a
fragment of approximately 2.9 kbp.

The sample of cosmid 26A5 DNA previously described (from
DH5α cells) was used as the source of DNA from which to subclone
the bands of interest. This DNA (2.5 µg) was digested with about
3 units of EcoR I (NEB) in a total volume of 30 µl for 1.5 hr, to
give a partial digest, as confirmed by gel electrophoresis. Ten
µg of pBC KS(+) DNA (Stratagene) were digested for 1.5 hr with
20 units of EcoR I in a total volume of 20 µl, leading to total
digestion as confirmed by electrophoresis. Both EcoR I-cut DNA
preparations were diluted to 50 µl with water, to each an equal
volume of PCI was added, the suspension was gently mixed, spun in
a microcentrifuge and the aqueous supernatant was collected. DNA
was precipitated by 150 µl ethanol, and the mixture was placed at
-20°C overnight. Following centrifugation and drying, the EcoR
I-digested pBC KS(+) was dissolved in 100 µl TE; the partially
digested 26A5 was dissolved in 20 µl TE. DNA recovery was
checked by fluorometry.

In separate reactions, approximately 60 ng of EcoR
I-digested pBC KS(+) DNA was ligated with approximately 180 ng or
270 ng of partially digested cosmid 26A5 DNA. Ligations were
carried out in a volume of 20 µl at 15°C for 5 hr, using T4
ligase and buffer from New England BioLabs. The ligation
mixture, diluted to 100 µl with sterile TE, was used to transform
frozen, competent DH5α cells (Gibco BRL) according to the
supplier's instructions. Varying amounts (25-200 µl) of the
transformed cells were plated on freshly prepared solid LB-Cam35
medium with 1 mM IPTG and 50 mg/l X-gal. Plates were incubated
at 37°C about 20 hr, then chilled in the dark for approximately
[...] hr to intensify color for insert selection. White colonies
were picked onto patch plates of the same composition and
incubated overnight at 37°C.

Two colony lifts of each of the selected patch plates were
prepared as follows. After picking white colonies to fresh
plates, round Magna NT nylon membranes were pressed onto the
patch plates, the membrane was lifted off, and subjected to
denaturation, neutralization and UV crosslinking as described
above for the library colony membranes. The crosslinked colony
lifts were vigorously washed, including gently wiping off the
excess cell debris with a tissue. One set was hybridized with
the GZ4 (TcaC) probe solution described earlier, and the other
set was hybridized with the TcaBi probe solution described
earlier, according to the "minimal hyb" protocol, followed by
washing and film exposure as described for the library colony
membranes.

Colonies showing hybridization signals either only with the
GZ4 probe, with both GZ4 and TcaBi probes, or only with the TcaBi
probe, were selected for further work and cells were streaked for
single colony isolation onto LB-Cam35 media with IPTG and X-gal as
before. Approximately 35 single colonies, from 16 different
isolates, were picked into liquid LB-Cam35 media and grown
overnight at 37°C; the cells were collected by centrifugation and
plasmid DNA was isolated by a standard alkaline lysis miniprep
according to Maniatis et al. (op. cit., p. 368). DNA pellets were
dissolved in TE + 25 µg/ml ribonuclease A and DNA concentration
was determined by fluorometry. The EcoR I digestion pattern was
analyzed by gel electrophoresis. The following isolates were
picked as useful. Isolate A17.2 contains religated pBC KS(+)
only and was used as a (negative) control. Isolates D38.3 and
C44.1 each contain only the 2.9 kbp, TcaBi-hybridizing EcoR I
fragment inserted into pBC KS(+). These plasmids, named pDAB2000
and pDAB2001, respectively, are illustrated in Fig. 2.

Isolate A35.3 contains only the approximately 5 kbp, GZ4-
hybridizing EcoR I fragment, inserted into pBC KS(+). This
plasmid was named pDAB2002 (also Fig. 2). These isolates
provided templates for DNA sequencing.

Plasmids pDAB2000 and pDAB2001 were prepared using the
BIGprep™ kit as before. Cultures (30 ml) were grown overnight in
TB-Cam35 to an OD600 of 2, then plasmid was isolated according to
the manufacturer's directions. DNA pellets were redissolved in
100 µl TE each, and sample integrity was checked by EcoR I
digestion and gel electrophoretic analysis.

Sequencing reactions were run in duplicate, with one
replicate using as template pDAB2000 DNA, and the other replicate
using as template pDAB2001 DNA. The reactions were carried out
using the dideoxy dye terminator cycle sequencing method, as
described above for the sequencing of the GZ4/HB14 DNAs. Initial
sequencing runs utilized as primers the LacZ and T7 primers
described above, plus primers based on the determined sequence of
the TcaBi PCR amplification product (TH1 =
ATTGCAGACTGCCAATCGCTTCGG, TH12 = GAGAGTATCCAGACCGCGGATGATCTG).

After alignment and editing of each sequencing output, each
was truncated to between 250 and 350 bases, depending on the
integrity of the chromatographic data as interpreted by the
Perkin Elmer Applied Biosystems Division SeqEd 675 software.
Subsequent sequencing "steps" were made by selecting appropriate
sequence for new primers. With a few exceptions, primers
(synthesized as described above) were 24 bases in length with a
50% G+C composition. Sequencing by this method was carried out
on both strands of the approximately 2.9 kbp EcoR I fragment.

To further serve as template for DNA sequencing, plasmid DNA
from isolate pDAB2002 was prepared by BIGprep™ kit. Sequencing
reactions were performed and analyzed as described above.
Initially, a T3 primer (pBS SK(+) bases 774-796:
CGCGCAATTAACCCTCACTAAAG) and a T7 primer (pBS KS(+) bases 621-643:
GCGCGTAATACGACTCACTATAG) were used to prime the sequencing
reactions from the flanking vector sequences, reading into the
insert DNA. Another set of primers (GZ4F:
GTATCGATTACAACGCTGTCACTTCCC; TH13: GGGAAGTGACAGCGTTGTAATCGATAC;
TH14: ATGTTGGGTGCGTCGGCTAATGGACATAAC; and LW1-204:
GGGAAGTGACAGCGTTGTAATCGATAC) was made to prime from internal
sequences, which were determined previously by degenerate
oligonucleotide-mediated sequencing of subcloned TcaC-peptide PCR
products. From the data generated during the initial rounds of
sequencing, new sets of primers were designed and used to walk
the entire length of the ~5 kbp fragment. A total of 55 oligo
primers was used, enabling the identification of 4832 total bp of
contiguous sequence.
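
As an illustration of the primer bookkeeping described in this section, the following minimal sketch (not part of the original disclosure) reports the length and G+C content of several primers named above and checks whether two of them read the same site on opposite strands. The helper names gc_fraction and reverse_complement are hypothetical and were chosen only for this example.

```python
# Primer sequences copied from the text of this section.
PRIMERS = {
    "TH1": "ATTGCAGACTGCCAATCGCTTCGG",
    "TH12": "GAGAGTATCCAGACCGCGGATGATCTG",
    "GZ4F": "GTATCGATTACAACGCTGTCACTTCCC",
    "TH13": "GGGAAGTGACAGCGTTGTAATCGATAC",
    "TH14": "ATGTTGGGTGCGTCGGCTAATGGACATAAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in an unambiguous DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, G+C = {gc_fraction(seq):.0%}")

# True when GZ4F and TH13 anneal to the same site on opposite strands.
print(reverse_complement(PRIMERS["GZ4F"]) == PRIMERS["TH13"])
```

Run as written, the last line shows that TH13 is the reverse complement of GZ4F, so the two primers direct sequencing outward in both directions from the same internal site.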

When the DNA sequence of the EcoR I fragment insert of
pDAB2002 is combined with part of the determined sequence of the
pDAB2000/pDAB2001 isolates, a total contiguous sequence of 6005
bp is generated (disclosed herein as SEQ ID NO:25). When long
open reading frames were translated into the corresponding amino
acids, the sequence clearly shows the TcaBi N-terminal peptide
(disclosed as SEQ ID NO:3), encoded by bases 19-75, immediately
following a methionine residue (start of translation). Upstream
lies a potential ribosome binding site (bases 1-9), and
downstream, at bases 166-228, is encoded the TcaBi-PT158 internal
peptide (disclosed herein as SEQ ID NO:19). Further downstream,
in the same reading frame, at bases 1738-1773, exists a sequence
encoding the TcaBi-PT108 internal peptide (disclosed herein as
SEQ ID NO:20). Also in the same reading frame, at bases 1897-
1923, is encoded the TcaBii N-terminal peptide (disclosed herein
as SEQ ID NO:5), and the reading frame continues uninterrupted to
a translation termination codon at nucleotides 3586-3588.

The lack of an in-frame stop codon between the end of the
sequence encoding TcaBi-PT108 and the start of the TcaBii encoding
region, and the lack of a discernible ribosome binding site
immediately upstream of the TcaBii coding region, indicate that
peptides TcaBii and TcaBi are encoded by a single open reading
frame of 3567 bp (beginning at base pair 16 in SEQ ID NO:25), and
are most likely derived from a single primary gene product of
1189 amino acids (131,586 Daltons; disclosed herein as SEQ ID
NO:26) by post-translational cleavage. If the amino acid
immediately preceding the TcaBii N-terminal peptide represents
the C-terminal amino acid of peptide TcaBi, then the predicted
mass of TcaBi (627 amino acids) is 70,814 Daltons (disclosed
herein as SEQ ID NO:28), somewhat higher than the size observed
by SDS-PAGE (68 kDa). This peptide would be encoded by a
contiguous stretch of 1881 base pairs (disclosed herein as SEQ ID
NO:27). It is thought that the native C-terminus of TcaBi lies
somewhat closer to the C-terminus of TcaBi-PT108. The molecular
mass of PT108 [3.438 kDa; determined during N-terminal amino acid
sequence analysis of this peptide] predicts a size of 30 amino
acids. Using the size of this peptide to designate the C-
terminus of the TcaBi coding region [Glu at position 604 of SEQ
ID NO:28], the derived size of TcaBi is determined to be 604
amino acids or 68,463 Daltons, more in agreement with
experimental observations.

Translation of the TcaBii peptide coding region of 1686 base
pairs (disclosed herein as SEQ ID NO:29) yields a protein of 562
amino acids (disclosed herein as SEQ ID NO:30) with predicted
mass of 60,789 Daltons, which corresponds well with the observed
61 kDa.
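
The coordinate arithmetic behind these assignments can be checked with a short calculation. The sketch below is illustrative only; it uses the base positions and peptide lengths stated in the text, and the variable names are assumptions introduced for the example.

```python
CODON = 3  # nucleotides per amino acid

orf_first_base = 16           # first base of the tcaB reading frame in SEQ ID NO:25
orf_length_bp = 3567          # stated length of the reading frame
first_peptide_residues = 627  # the 627-residue peptide (SEQ ID NO:28)
second_peptide_residues = 562 # the 562-residue peptide (SEQ ID NO:30)

# Residues encoded by the whole reading frame.
total_residues = orf_length_bp // CODON

# Base pairs needed to encode each peptide.
first_peptide_bp = first_peptide_residues * CODON
second_peptide_bp = second_peptide_residues * CODON

# First base of the second peptide's coding region, if it starts
# immediately after the 627-residue peptide.
second_peptide_first_base = orf_first_base + first_peptide_bp

print(total_residues, first_peptide_bp, second_peptide_bp,
      second_peptide_first_base)
```

The computed start of the second coding region coincides with the first base (1897) at which the TcaBii N-terminal peptide is reported to be encoded, and the two peptide lengths sum to the 1189 residues of the proposed primary gene product.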

A potential ribosome binding site (bases 3633-3638) is found
48 bp downstream of the stop codon for the tcaB open reading
frame. At bases 3645-3677 is found a sequence encoding the N-
terminus of peptide TcaC (disclosed as SEQ ID NO:2). The open
reading frame initiated by this N-terminal peptide continues
uninterrupted to base 6005 (2361 base pairs, disclosed herein as
the first 2361 base pairs of SEQ ID NO:31). A gene (tcaC)
encoding the entire TcaC peptide (apparent size ~165 kDa; ~1500
amino acids) would comprise about 4500 bp.

Another isolate containing cloned EcoR I fragments of cosmid
26A5, E20.6, was also identified by its homology to the
previously mentioned GZ4 and TcaBi probes. Agarose gel analysis
of EcoR I digests of the DNA of the plasmid harbored by this
strain (pDAB2004, Fig. 2) revealed insert fragments of estimated
sizes 2.9, 5, and 3.3 kbp. DNA sequence analysis initiated from
primers designed from the sequence of plasmid pDAB2002 revealed
that the 3.3 kbp EcoR I fragment of pDAB2004 lies adjacent to the
5 kbp EcoR I fragment represented in pDAB2002. The 2361 base
pair open reading frame discovered in pDAB2002 continues
uninterrupted for another 2094 bases in pDAB2004 [disclosed
herein as base pairs 2362 to 4458 of SEQ ID NO:31]. DNA sequence
analysis using the parent cosmid 26A5 DNA as template confirmed
the continuity of the open reading frame. Altogether, the open
reading frame (TcaC, SEQ ID NO:31) comprises 4455 base pairs, and
encodes a protein (TcaC) of 1485 amino acids [disclosed herein as
SEQ ID NO:32]. The calculated molecular size of 166,214 Daltons
is consistent with the estimated size of the TcaC peptide (165
kDa), and the derived amino acid sequence matches exactly that
disclosed for the TcaC N-terminal sequence [SEQ ID NO:2].

The lack of an amino acid sequence corresponding to SEQ ID
NO:17 (used to design the degenerate oligonucleotide primer pool)
in the discovered sequence indicates that the PCR® products found
in isolates GZ4 and HB14, which were used as probes in the
initial library screen, were fortuitously generated by
reverse-strand priming by one of the primers in the degenerate
pool. Further, the derived protein sequence does not include the
internal fragment disclosed herein as SEQ ID NO:18. These
sequences reveal that plasmid pDAB2004 contains the complete
coding region for the TcaC peptide.

Example 9

Screening of the Photorhabdus genomic library
for genes encoding the TcbAii peptide

This example describes a method used to identify DNA clones
that contain the TcbAii peptide-encoding genes, the isolation of
the gene, and the determination of its partial DNA base sequence.

Primers and PCR reactions

The TcbAii polypeptide of the insect active preparation is
~206 kDa. The amino acid sequence of the N-terminus of this peptide
is disclosed as SEQ ID NO:1. Four pools of degenerate
oligonucleotide primers ("Forward primers": TH-4, TH-5, TH-6, and
TH-7) were synthesized to encode a portion of this amino acid
sequence, as described in Example 8, and are shown below.



Table 11

Amino
Acid     Phe      Ile  Gln      Gly  Tyr      Ser      Asp      Leu      Phe
TH-4  5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  CTI      TT-3'
TH-5  5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  CTI      TT-3'
TH-6  5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  TT(A/G)  TT-3'
TH-7  5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  TT(A/G)  TT-3'
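
To make the notion of a degenerate primer "pool" concrete, the following sketch (an illustration, not the synthesis procedure used here) expands the Table 11 notation into the individual oligonucleotides each pool contains. It assumes that a parenthesized pair such as (T/C) means an equal mixture of the two bases and that I (inosine) is a single fixed residue; the function names are hypothetical.

```python
import itertools
import math
import re


def parse_degenerate(primer):
    """Turn 'TT(T/C) ATI ...' into a list of base choices per position."""
    compact = primer.replace(" ", "")
    tokens = re.findall(r"\(([A-Z/]+)\)|([A-Z])", compact)
    return [group.split("/") if group else [base] for group, base in tokens]


def degeneracy(primer):
    """Number of distinct oligonucleotides in the pool."""
    return math.prod(len(choices) for choices in parse_degenerate(primer))


def expand(primer):
    """List every oligonucleotide in the pool (I is kept as a letter)."""
    return ["".join(combo)
            for combo in itertools.product(*parse_degenerate(primer))]


POOLS = {
    "TH-4": "TT(T/C) ATI CA(A/G) GGI TA(T/C) TCI GA(T/C) CTI TT",
    "TH-5": "TT(T/C) ATI CA(A/G) GGI TA(T/C) AG(T/C) GA(T/C) CTI TT",
    "TH-6": "TT(T/C) ATI CA(A/G) GGI TA(T/C) TCI GA(T/C) TT(A/G) TT",
    "TH-7": "TT(T/C) ATI CA(A/G) GGI TA(T/C) AG(T/C) GA(T/C) TT(A/G) TT",
}

for name, primer in POOLS.items():
    members = expand(primer)
    print(name, degeneracy(primer), "sequences of", len(members[0]), "nt")
```

Splitting the serine (TCI versus AG(T/C)) and leucine (CTI versus TT(A/G)) codon families across four pools keeps each pool small, which is presumably why four separate Forward primer pools were synthesized rather than one.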



In addition, a primary ("a") and a secondary ("b") sequence
of an internal peptide preparation (TcbAii-PT81) have been
determined and are disclosed herein as SEQ ID NO:23 and SEQ ID
NO:24, respectively. Four pools of degenerate oligonucleotides
("Reverse primers": TH-8, TH-9, TH-10 and TH-11) were similarly
designed and synthesized to encode the reverse complement of
sequences that encode a portion of the peptide of SEQ ID NO:23,
as shown below.

Sets of these primers were used in PCR® reactions to amplify
TcbAii-encoding gene fragments from the genomic Photorhabdus
luminescens W-14 DNA prepared in Example 6. All PCR® reactions
were run with the "Hot Start" technique using AmpliWax™ gems and
other Perkin Elmer reagents and protocols. Typically, a mixture
(total volume 11 µl) of MgCl2, dNTPs, 10X GeneAmp® PCR Buffer II,
and the primers was added to tubes containing a single wax bead.
[10X GeneAmp® PCR Buffer II is composed of 100 mM Tris-HCl, pH
8.3; and 500 mM KCl.] The tubes were heated to 80°C for 2
minutes and allowed to cool. To the top of the wax seals, a
solution containing 10X GeneAmp® PCR Buffer II, DNA template, and
AmpliTaq® DNA polymerase was added. Following melting of the wax
seal and mixing of components by thermal cycling, final reaction
conditions (volume of 50 µl) were: 10 mM Tris-HCl, pH 8.3; 50 mM
KCl; 2.5 mM MgCl2; 200 µM each in dATP, dCTP, dGTP, dTTP; 1.25 µM
in a single Forward primer pool; 1.25 µM in a single Reverse
primer pool; 1.25 units of AmpliTaq® DNA polymerase; and 170 ng of
template DNA.
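
The final reaction conditions follow from ordinary dilution arithmetic. The sketch below is a hedged illustration only: it assumes a 50 µl final volume, takes both primer pools at 1.25 µM, and uses a hypothetical helper, stock_volume_ul, that is not part of any kit protocol.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock needed so that C_stock * V_stock = C_final * V_final.

    Both concentrations must be given in the same unit.
    """
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 50.0

# 10X buffer (100 mM Tris-HCl, 500 mM KCl) diluted to 1X (10 mM, 50 mM).
buffer_ul = stock_volume_ul(10.0, 100.0, FINAL_VOLUME_UL)

# Amount of each primer pool at 1.25 uM in 50 ul (uM x ul = pmol).
primer_pmol = 1.25 * FINAL_VOLUME_UL

# Amount of each dNTP at 200 uM in 50 ul, converted from pmol to nmol.
dntp_nmol = 200.0 * FINAL_VOLUME_UL / 1000.0

print(buffer_ul, primer_pmol, dntp_nmol)
```

The same relation gives the volume of any other stock once its concentration is known; the stock concentrations of MgCl2, dNTPs and primers are not stated in the text and are therefore not assumed here.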

The reactions were placed in a thermocycler (as in
Example 8) and run with the following program:

Table 13

Temperature   Time          Cycle Repetition
94°C          2 minutes     1X
94°C          15 seconds
55-65°C       30 seconds    30X
72°C          1 minute
72°C          7 minutes     1X
15°C          Constant
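
For planning purposes, the programmed hold time of Table 13 can be totalled as in the following sketch. It assumes that the 30X repetition applies to the three-step denature/anneal/extend block and ignores temperature ramp times, which depend on the instrument; the constant names are illustrative.

```python
# Hold times in seconds, taken from Table 13; ramp times are ignored.
INITIAL_DENATURATION_S = 2 * 60
CYCLE_STEPS_S = {
    "denature (94 C)": 15,
    "anneal (55-65 C)": 30,
    "extend (72 C)": 60,
}
CYCLES = 30  # assumed to apply to the three steps above as one block
FINAL_EXTENSION_S = 7 * 60

cycle_s = sum(CYCLE_STEPS_S.values())
total_s = INITIAL_DENATURATION_S + CYCLES * cycle_s + FINAL_EXTENSION_S

print(f"one cycle: {cycle_s} s; whole program: {total_s / 60:.1f} min")
```

The final 15°C step is an indefinite hold and is therefore left out of the total.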
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A series of amplifications was run at three different 
annealing temperatures (55°, 60°, 65°C) using the degenerate
primer pools. Reactions with annealing at 65°C had no
amplification products visible following agarose gel
electrophoresis. Reactions having a 60°C annealing regime and
containing primers TH-5+TH-10 produced an amplification product
that had a mobility corresponding to 2.9 kbp. A lesser amount of
the 2.9 kbp product was produced under these conditions with
primers TH-7+TH-10. When reactions were annealed at 55°C, these
primer pairs produced more of the 2.9 kbp product, and this
product was also produced by primer pairs TH-5+TH-8 and
TH-5+TH-11. Additional very faint 2.9 kbp bands were seen in lanes
containing amplification products from primer pairs TH-7 plus
TH-8, TH-9, TH-10, or TH-11.

To obtain sufficient PCR amplification product for cloning
and DNA sequence determination, 10 separate PCR reactions were
set up using the primers TH-5+TH-10, and were run using the above
conditions with a 55°C annealing temperature. All reactions were
pooled and the 2.9 kbp product was purified by Qiaex extraction
from an agarose gel as described above.

Additional sequences determined for TcbAii internal peptides
are disclosed herein as SEQ ID NO:21 and SEQ ID NO:22. As
before, degenerate oligonucleotides (Reverse primers TH-17 and
TH-18) were made corresponding to the reverse complement of
sequences that encode a portion of the amino acid sequence of
these peptides.

Table 14

From SEQ ID NO:21

Amino
Acid        Met  Glu    Thr  Gln    Asn    Ile  Gln    Glu    Pro
TH-17    3'-TAC  CTT/C  TGI  GTT/C  TTA/G  TAI  GTT/C  GTT/C  GG-5'

Table 15

From SEQ ID NO:22

Amino
Acid        Asn      Pro  Ile  Asn      Ile  Asn      Thr  Gly  Ile  Asp
TH-18    3'-TT(A/G)  GGI  TAI  TT(A/G)  TAI  TT(A/G)  TGI  CCI  TAI  CT(A/G)-5'
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Degenerate oligonucleotides TH-18 and TH-17 were used in an 
amplification experiment with Photorhabdus luminescens W-14 DNA
as template and primers TH-4, TH-5, TH-6, or TH-7 as the 5'
(Forward) primers. These reactions amplified products of
approximately 4 kbp and 4.5 kbp, respectively. These DNAs were
transferred from agarose gels to nylon membranes and hybridized
with a 32P-labeled probe (as described above) prepared from the
2.9 kbp product amplified by the TH-5+TH-10 primer pair. Both the
4 kbp and the 4.5 kbp amplification products hybridized strongly
to the 2.9 kbp probe. These results were used to construct a map
ordering the TcbAii internal peptide sequences as shown in
Fig. 3. Approximate distances between the primers are shown in
nucleotides in Fig. 3.

DNA Sequence of the 2.9 kbp TcbAii-encoding fragment

Approximately 200 ng of the purified 2.9 kbp fragment
(prepared above) was precipitated with ethanol and dissolved in
17 µl of water. One-half of this was used as sequencing template
with 25 pmol of the TH-5 pool as primers; the other half was used
as template for TH-10 priming. Sequencing reactions were as
given in Example 8. No reliable sequence was produced using the
TH-10 primer pool; however, reactions with the TH-5 primer pool
produced the sequence disclosed below:

  1 AATCGTGTTG ATCCCTATGC CGNGCCGGGT TCGGTGGAAT CGATGTCCTC ACCGGGGGTT
 61 TATTNGAGGG ANTNGTCCCG TGAGGCCAAA AANTGGAATG AAAGAAGTTC AATTTNTTAC
121 CTAGATAAAC GTCGCCCGGN TTTAGAAAGN TTANTGNTCA GCCAGAAAAT TTTGGTTGAG
181 GAAATTCCAC CGNTGGTTCT CTCTATTGAT TNGGCCCTGG CCGCCTTCGA ANNAAAACMA
241 GGAAATNCAC AAGTTGAGGT GATGGNTTTG TWGCNAMCTT NTCGTTTAGG TGGGGAGAAA
301 CCTTNTCANC ACGNTTNTGA AACTGTCCGG GAAATCGTCC ATGANCGTGA NCCAGGNTTN
361 CGCCATTGG

Based on this sequence, a sequencing primer (TH-21, 5'-
CCGGGCGACGTTTATCTAGG-3') was designed to reverse complement bases
120-139, and initiate polymerization towards the 5' end (i.e.,
TH-5 end) of the gel-purified 2.9 kbp TcbAii-encoding PCR
fragment. The determined sequence is shown below, and is
compared to the biochemically determined N-terminal peptide
sequence of TcbAii, SEQ ID NO:1.

TcbAii 2.9 kbp PCR fragment Sequence Confirmation

[Underlined amino acids = encoded by degenerate oligonucleotides]

SEQ ID NO:1   F   I   Q   G   Y   S   D   L   F   G   -   -   A
                      |   |   |   |   |   |   |   |           |
2.9 kbp seq   GC ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT
                  M   Q   G   Y   S   D   L   F   G   N   R   A

From the homology of the derived amino acid sequence to the
biochemically determined one, it is clear that the 2.9 kbp PCR
fragment represents the TcbA coding region. This 2.9 kbp
fragment was then used as a hybridization probe to screen the
Photorhabdus W-14 genomic library prepared in Example 8 for
cosmids containing the TcbAii-encoding gene.
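
The comparison between the determined DNA sequence and the N-terminal peptide can be reproduced with a small translation routine. The following is a minimal sketch using the standard genetic code; the 38-base string is the "2.9 kbp seq" line of the confirmation above with spaces removed, the peptide is SEQ ID NO:1 as spelled out later in this Example, and the function name translate is an assumption of this illustration.

```python
import itertools

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate DNA in the given frame; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(frame, len(dna) - 2, 3))


FRAGMENT = "GCATGCAGGGGTATAGTGACCTGTTTGGTAATCGTGCT"
SEQ_ID_NO_1 = "FIQGYSDLFGNRA"  # Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala

derived = translate(FRAGMENT, frame=2)  # skip the leading 'GC'
print(derived)

# Residues downstream of the primer-encoded Phe and Ile positions.
print(derived[1:] == SEQ_ID_NO_1[2:])
```

Only the first derived residue differs from the peptide, and that codon lies within the region contributed by the degenerate TH-5 primer itself, where an inosine was used at the third position.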


Screening the Photorhabdus cosmid library 

The 2.9 kb gel-purified PCR fragment was labeled with 32P
using the Boehringer Mannheim High Prime labeling kit as
described in Example 8. Filters containing remnants of
approximately 800 colonies from the cosmid library were screened
as described previously (Example 8), and positive clones were
streaked for isolated colonies and rescreened. Three clones
(8A11, 25G8, and 26D1) gave positive results through several
screening and characterization steps. No hybridization of the
TcbAii-specific probe was ever observed with any of the four
cosmids identified in Example 8, which contain the tcaB and
tcaC genes. DNA from cosmids 8A11, 25G8, and 26D1 was digested
with restriction enzymes Bgl II, EcoR I or Hind III (either alone
or in combination with one another), and the fragments were
separated on an agarose gel and transferred to a nylon membrane
as described in Example 8. The membrane was hybridized with 32P-
labeled probe prepared from the 4.5 kbp fragment (generated by
amplification of Photorhabdus genomic DNA with primers TH-5+TH-
17). The patterns generated from cosmid DNAs 8A11 and 26D1 were
identical to those generated with similarly-cut genomic DNA on
the same membrane. It is concluded that cosmids 8A11 and 26D1
are accurate representations of the genomic TcbAii encoding
locus. However, cosmid 25G8 has a single Bgl II fragment which is
slightly larger than the genomic DNA. This may result from
positioning of the insert within the vector.

DNA sequence of the tcbA-encoding gene

The membrane hybridization analysis of cosmid 26D1 revealed
that the 4.5 kbp probe hybridized to a single large EcoR I
fragment (greater than 9 kbp). This fragment was gel purified
and ligated into the EcoR I site of pBC KS(+) as described in
Example 8, to generate plasmid pBC-S1/R1. The partial DNA
sequence of the insert DNA of this plasmid was determined by
"primer walking" from the flanking vector sequence, using
procedures described in Example 8. Further sequence was
generated by extension from new oligonucleotides designed from
the previously determined sequence. When compared to the
determined DNA sequence for the tcbA gene identified by other
methods (disclosed herein as SEQ ID NO:11, as described in Example
12 below), complete homology was found to nucleotides 1-272, 319-
826, 2578-3036, and 3068-3540 (total bases = 1712). It was
concluded that both approaches can be used to identify DNA
fragments encoding the TcbAii peptide.
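
The stated total of matched bases can be verified by summing the four nucleotide ranges, counting both end positions of each range. This is a simple illustrative check; the names used are hypothetical.

```python
# Nucleotide ranges of SEQ ID NO:11 reported as completely matching,
# given as inclusive (first, last) positions.
MATCHED_RANGES = [(1, 272), (319, 826), (2578, 3036), (3068, 3540)]


def inclusive_length(first, last):
    """Number of positions from first to last, counting both ends."""
    return last - first + 1


matched_bases = sum(inclusive_length(a, b) for a, b in MATCHED_RANGES)
print(matched_bases)
```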



Analysis of the derived amino acid sequence of the tcbA gene.

The sequence of the DNA fragment identified as SEQ ID NO:11
encodes a protein whose derived amino acid sequence is disclosed herein
as SEQ ID NO:12. Several features verify the identity of the gene as
that encoding the TcbAii protein. The TcbAii N-terminal peptide (SEQ
ID NO:1; Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala) is
encoded as amino acids 88-100. The TcbAii internal peptide TcbAii-
PT81(a) (SEQ ID NO:23) is encoded as amino acids 1065-1077, and TcbAii-
PT81(b) (SEQ ID NO:24) is encoded as amino acids 1571-1592. Further,
the internal peptide TcbAii-PT56 (SEQ ID NO:22) is encoded as amino
acids 1474-1488, and the internal peptide TcbAii-PT103 (SEQ ID NO:24)
is encoded as amino acids 1614-1639. It is obvious that this gene is
an authentic clone encoding the TcbAii peptide as isolated from
insecticidal protein preparations of Photorhabdus luminescens strain
W-14.

The protein isolated as peptide TcbAii is derived from cleavage
of a longer peptide. Evidence for this is provided by the fact that
the nucleotides encoding the TcbAii N-terminal peptide SEQ ID NO:1 are
preceded by 261 bases (encoding 87 N-terminal-proximal amino acids) of
a longer open reading frame (SEQ ID NO:11). This reading frame begins
with nucleotides that encode the amino acid sequence Met Gln Asn Ser
Leu, which corresponds to the N-terminal sequence of the large peptide
TcbA, and is disclosed herein as SEQ ID NO:16. It is thought that TcbA
is the precursor protein for TcbAii.



Relationship of tcbA, tcaB and tcaC genes.

The tcaB and tcaC genes are closely linked and may be
transcribed as a single mRNA (Example 8). The tcbA gene is borne
on cosmids that apparently do not overlap the ones harboring the
tcaB and tcaC cluster, since the respective genomic library
screens identified different cosmids. However, comparison of the
amino acid sequences encoded by the tcaB and tcaC genes with the
tcbA gene reveals a substantial degree of homology. The amino acid
conservation (Protein Alignment Mode of MacVector™ Sequence
Analysis Software, scoring matrix pam250, hash value = 2; Kodak
Scientific Imaging Systems, Rochester, NY) is shown in Fig. 4.
On the score line of each panel in Fig. 4, up carats (^) indicate
homology or conservative amino acid changes, and down carats (v)
indicate nonhomology.

This analysis shows that the amino acid sequence of the TcbA
peptide from residues 1739 to 1894 is highly homologous to amino
acids 441 to 603 of the TcaBi peptide (162 of the total 627 amino
acids of P8; SEQ ID NO:28). In addition, the sequence of TcbA
amino acids 1932 to 2459 is highly homologous to amino acids 12
to 531 of peptide TcaBii (520 of the total 562 amino acids; SEQ
ID NO:30). Considering that the TcbA peptide (SEQ ID NO:12)
comprises 2505 amino acids, a total of 684 amino acids (27%) at
the C-proximal end of it is homologous to the TcaBi or TcaBii
peptides, and the homologies are arranged colinear to the
arrangement of the putative TcaB preprotein (SEQ ID NO:26). A
sizeable gap in the TcbA homology coincides with the junction
between the TcaBi and TcaBii portions of the TcaB preprotein.
Clearly the TcbA and TcaB gene products are evolutionarily
related, and it is proposed that they share some common
function(s) in Photorhabdus.
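
The 684-residue and 27% figures refer to the TcbA side of the alignment and can be recomputed from the residue ranges given above. The sketch below is illustrative; it also reports the length of the unaligned stretch of TcbA lying between the two homologous regions.

```python
TCBA_LENGTH = 2505  # residues in the TcbA peptide (SEQ ID NO:12)

# TcbA residue ranges (inclusive) reported as homologous to TcaBi and TcaBii.
TCBA_HOMOLOGOUS = {"TcaBi": (1739, 1894), "TcaBii": (1932, 2459)}

region_lengths = {
    partner: last - first + 1
    for partner, (first, last) in TCBA_HOMOLOGOUS.items()
}
homologous_total = sum(region_lengths.values())
percent_of_tcba = round(100 * homologous_total / TCBA_LENGTH)

# Residues of TcbA lying between the two homologous regions.
gap_residues = TCBA_HOMOLOGOUS["TcaBii"][0] - TCBA_HOMOLOGOUS["TcaBi"][1] - 1

print(region_lengths, homologous_total, percent_of_tcba, gap_residues)
```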
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Example 10 

Characterization of zinc-metalloproteases in Photorhabdus Broth: 
Protease Inhibition, Classification, and Purification 

Protease Inhibition and Classification Assays: Protease
assays were performed using FITC-casein dissolved in water as
substrate (0.08% final assay concentration). Proteolysis
reactions were performed at 25°C for 1 h in the appropriate
buffer with 25 µl of Photorhabdus broth (150 µl total reaction
volume). Samples were also assayed in the presence and absence
of dithiothreitol. After incubation, an equal volume of 12%
trichloroacetic acid was added to precipitate undigested protein.
Following precipitation for 0.5 h and subsequent centrifugation,
100 µl of the supernatant was placed into a 96-well microtiter
plate and the pH of the solution was adjusted by addition of an
equal volume of 4N NaOH. Proteolysis was then quantitated using
a Fluoroskan II fluorometric plate reader at excitation and
emission wavelengths of 485 and 538 nm, respectively. Protease
activity was tested over a range from pH 5.0-10.0 in 0.5 unit
increments. The following buffers were used at 50 mM final
concentration: sodium acetate (pH 5.0-6.5); Tris-HCl (pH 7.0-
8.0); and bis-Tris propane (pH 8.5-10.0). To identify the class
of protease(s) observed, crude broth was treated with a variety
of protease inhibitors (0.5 µg/µl final concentration) and then
examined for protease activity at pH 8.0 using the substrate
described above. The protease inhibitors used included E-64 (L-
trans-epoxysuccinylleucylamido[4-guanidino]-butane), 3,4
dichloroisocoumarin, leupeptin, pepstatin, amastatin,
ethylenediaminetetraacetic acid (EDTA) and 1,10 phenanthroline.

Protease assays performed over a pH range revealed that
protease(s) were indeed present which exhibited maximal activity
at ~pH 8.0 (Table 16). Addition of DTT did not have any effect
on protease activity. Crude broth was then treated with a
variety of protease inhibitors (Table 17). Treatment of crude
broth with the inhibitors described above revealed that 1,10
phenanthroline caused complete inhibition of all protease
activity when added at a final concentration of 50 µg, with the
IC50 = 5 µg in 100 µl of a 2 mg/ml crude broth solution. These
data indicate that the most abundant protease(s) found in the
Photorhabdus broth are from the zinc-metalloprotease class of
enzymes.



Table 16

Effect of pH on the protease activity found in a Day 1 production
of Photorhabdus luminescens (strain W-14).

pH      Flu. Units(a)     Percent Activity(b)
 5.0     3013 ± 78         17
 5.5     7994 ± 448        45
 6.0    12965 ± 483        74
 6.5    14390 ± 1291       82
 7.0    14386 ± 1287       82
 7.5    14135 ± 198        80
 8.0    17582 ± 831       100
 8.5    16183 ± 953        92
 9.0    16795 ± 760        96
 9.5    16279 ± 1022       93
10.0    15225 ± 210        87

a. Flu. Units = Fluorescence Units (Maximum = ~28,000;
background = ~2200).

b. Percent activity relative to the maximum at pH 8.0
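
The Percent Activity column of Table 16 can be regenerated from the fluorescence readings, as the following sketch shows. It assumes that each percentage is the uncorrected mean reading divided by the reading at the pH optimum and rounded to a whole number; the pH 7.5 row label is a reading of a damaged table entry.

```python
# Mean fluorescence units at each pH, from Table 16 (no background correction).
READINGS = {
    5.0: 3013, 5.5: 7994, 6.0: 12965, 6.5: 14390, 7.0: 14386, 7.5: 14135,
    8.0: 17582, 8.5: 16183, 9.0: 16795, 9.5: 16279, 10.0: 15225,
}

# The pH optimum is taken as the pH with the highest mean reading.
optimum_ph = max(READINGS, key=READINGS.get)

percent_activity = {
    ph: round(100 * units / READINGS[optimum_ph])
    for ph, units in READINGS.items()
}

print("optimum pH:", optimum_ph)
for ph in sorted(percent_activity):
    print(f"pH {ph:>4}: {percent_activity[ph]:>3}%")
```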

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Table 17 

Effect of different protease inhibitors on the protease activity
at pH 8 found in a Day 1 production of Photorhabdus luminescens
(strain W-14).

Inhibitor                    Corrected Flu. Units(a)   Percent Inhibition(b)
Control                      13053                      0
E-64                         14259                      0
1,10 Phenanthroline(c)          15                     99
3,4 Dichloroisocoumarin(d)    7956                     39
Leupeptin                    13074                      0
Pepstatin(c)                 13441                      0
Amastatin                    12474                      4
DMSO Control                 12005                      8
Methanol Control             12125                      7

a. Corrected Flu. Units = Fluorescence Units - background
(2200 flu. units).

b. Percent Inhibition relative to protease activity at pH 8.0.

c. Inhibitors were dissolved in methanol.

d. Inhibitors were dissolved in DMSO.
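
The Percent Inhibition column of Table 17 can be reproduced from the corrected fluorescence units. The sketch below is a reconstruction under two stated assumptions that are not given in the text: values are truncated to whole percentages, and apparent negative inhibition (a treated reading above the control) is reported as 0.

```python
CONTROL_UNITS = 13053  # corrected fluorescence units of the untreated control

TREATED_UNITS = {
    "E-64": 14259,
    "1,10 Phenanthroline": 15,
    "3,4 Dichloroisocoumarin": 7956,
    "Leupeptin": 13074,
    "Pepstatin": 13441,
    "Amastatin": 12474,
    "DMSO Control": 12005,
    "Methanol Control": 12125,
}


def percent_inhibition(treated, control=CONTROL_UNITS):
    """Whole-number percent loss of activity; never below zero (assumption)."""
    return max(0, int(100 * (control - treated) / control))


for inhibitor, units in TREATED_UNITS.items():
    print(f"{inhibitor:<26}{percent_inhibition(units):>4}%")
```

On this reading, the solvent controls account for roughly 7 to 8% of the apparent inhibition, which is worth bearing in mind when interpreting the values for inhibitors dissolved in methanol or DMSO.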



The isolation of a zinc-metalloprotease was performed by
applying dialyzed 10-80% ammonium sulfate pellet to a Q Sepharose
column equilibrated at 50 mM Na2PO4, pH 7.0 as described in
Example 5 for Photorhabdus toxin. After extensive washing, a 0
to 0.5 M NaCl gradient was used to elute toxin protein. The
majority of biological activity and protein was eluted from 0.15
- 0.45 M NaCl. However, it was observed that the majority of
proteolytic activity was present in the 0.25-0.35 M NaCl fraction
with some activity in the 0.15-0.25 M NaCl fraction. SDS PAGE
analysis of the 0.25-0.35 M NaCl fraction showed a major peptide
band of approximately 60 kDa. The 0.15-0.25 M NaCl fraction
contained a similar 60 kDa band but at lower relative protein
concentration. Subsequent gel filtration of this fraction using
a Superose 12 HR 16/50 column resulted in a major peak migrating
at 57.5 kDa that contained a predominant (> 90% of total stained
protein) 58.5 kDa band by SDS PAGE analysis. Additional analysis
of this fraction using various protease inhibitors as described
above determined that the protease was a zinc-metalloprotease.
Nearly all of the protease activity present in Photorhabdus broth
at day 1 of fermentation corresponded to the ~58 kDa zinc-
metalloprotease.

In yet a second isolation of zinc-metalloprotease(s), W-14
Photorhabdus broth grown for three days was taken and protease
activity was visualized using sodium dodecyl sulfate-
polyacrylamide gel electrophoresis (SDS-PAGE) laced with gelatin
as described in Schmidt, T.M., Bleakley, B. and Nealson, K.M.
1988. SDS running gels (5.5 x 3 cm) were made with 12.5%
polyacrylamide (40% stock solution of acrylamide/bis-acrylamide;
Sigma Chemical Co., St. Louis, MO) into which 0.1% gelatin final
concentration (Biorad EIA grade reagent; Richmond, CA) was
incorporated upon dissolving in water. SDS-stacking gels (1.0 x
9 cm) were made with 5% polyacrylamide, also laced with 0.1%
gelatin. Typically, 2.5 µg of protein to be tested was diluted
in 0.03 ml of SDS-PAGE loading buffer without dithiothreitol
(DTT) and loaded onto the gel. Proteins were electrophoresed in
SDS running buffer (Laemmli, U.K. 1970. Nature 227, 680) at 0°C
and at 8 mA. After electrophoresis was complete, the gel was
washed for 2 h in 2.5% (v/v) Triton X-100. Gels were then
incubated for 1 h at 37°C in 0.1 M glycine (pH 8.0). After
incubation, gels were fixed and stained overnight with 0.1% amido
black in methanol-acetic acid-water (30:10:60, vol./vol./vol.;
Sigma Chemical Co.). Protease activity was visualized as light
areas against a dark, amido black stained background due to
proteolysis and subsequent diffusion of incorporated gelatin. At
least three distinct bands produced by proteolytic activity at
58-, 41-, and 38 kDa were observed.

Activity assays of the different proteases in W-14 day three
culture broth were performed using FITC-casein dissolved in water
as substrate (0.02% final assay concentration). Proteolysis
experiments were performed at 37°C for 0-0.5 h in 0.1 M Tris-HCl
(pH 8.0) with different protein fractions in a total volume of
0.15 ml. Reactions were terminated by addition of an equal
volume of 12% trichloroacetic acid (TCA) dissolved in water.
After incubation at room temperature for 0.25 h, samples were
centrifuged at 10,000 x g for 0.25 h and 0.10 ml aliquots were
removed and placed into 96-well microtiter plates. The solution
was then neutralized by the addition of an equal volume of 2 N
sodium hydroxide, followed by quantitation using a Fluoroskan II
fluorometric plate reader with excitation and emission
wavelengths of 485 and 538 nm, respectively. Activity
measurements were performed using FITC-casein with different
protease concentrations at 37°C for 0-10 min. A unit of
activity was arbitrarily defined as the amount of enzyme needed
to produce 1000 fluorescent units/min and specific activity was
defined as units/mg of protease.
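
The unit definitions above translate directly into a calculation. In the sketch below, the fluorescence increase, assay time and protease mass are hypothetical numbers chosen only to show the arithmetic; they are not data from this Example, and the function names are assumptions.

```python
FLUORESCENCE_UNITS_PER_ACTIVITY_UNIT = 1000.0  # per minute, as defined above


def activity_units(fluorescence_increase, minutes):
    """Units of activity from a fluorescence increase over an assay time."""
    rate_per_min = fluorescence_increase / minutes
    return rate_per_min / FLUORESCENCE_UNITS_PER_ACTIVITY_UNIT


def specific_activity(units, protease_mg):
    """Specific activity in units per mg of protease."""
    return units / protease_mg


# Hypothetical example: 5000 fluorescence units gained in 5 min
# by a reaction containing 0.002 mg of protease.
units = activity_units(5000.0, 5.0)
print(units, specific_activity(units, 0.002))
```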

Inhibition studies were performed using two zinc-
metalloprotease inhibitors, 1,10 phenanthroline and N-(α-
rhamnopyranosyloxyhydroxyphosphinyl)-Leu-Trp (phosphoramidon), with
stock solutions of the inhibitors dissolved in 100% ethanol and
water, respectively. Stock concentrations were typically 10
mg/ml and 5 mg/ml for 1,10 phenanthroline and phosphoramidon,
respectively, with final concentrations of inhibitor at 0.5-1.0
mg/ml per reaction. Treatment of three day W-14 crude broth with
1,10 phenanthroline, an inhibitor of all zinc metalloproteases,
resulted in complete elimination of all protease activity while
treatment with phosphoramidon, an inhibitor of thermolysin-like
proteases (Weaver, L.H., Kester, W.R., and Matthews, B.W. 1977.
J. Mol. Biol. 114, 119-132), resulted in ~56% reduction of
protease activity. The residual proteolytic activity could not
be further reduced with additional phosphoramidon.

The proteases of three day W-14 Photorhabdus broth were
purified as follows: 4.0 liters of broth were concentrated using
an Amicon spiral ultrafiltration cartridge Type S1Y100 attached
to an Amicon M-12 filtration device. The flow-through material
having native proteins less than 100 kDa in size (3.8 L) was
concentrated to 0.375 L using an Amicon spiral ultrafiltration
cartridge Type S1Y10 attached to an Amicon M-12 filtration
device. The retentate material contained proteins ranging in
size from 10-100 kDa. This material was loaded onto a Pharmacia
HR 16/10 column which had been packed with PerSeptive Biosystems
(Framingham, MA) Poros® 50 HQ strong anion exchange packing that
had been equilibrated in 10 mM sodium phosphate buffer (pH 7.0).
Proteins were loaded on the column at a flow rate of 5 ml/min,
followed by washing unbound protein with buffer until A280 =
0.00. Afterwards, proteins were eluted using a NaCl gradient of
0-1.0 M NaCl in 40 min at a flow rate of 7.5 ml/min. Fractions
were assayed for protease activity, supra, and active fractions
were pooled. Proteolytically active fractions were diluted with
50% (v/v) 10 mM sodium phosphate buffer (pH 7.0) and loaded onto
a Pharmacia HR 10/10 Mono Q column equilibrated in 10 mM sodium
phosphate. After washing the column with buffer until A280 =
0.00, proteins were eluted using a NaCl gradient of 0-0.5 M NaCl
for 1 h at a flow rate of 2.0 ml/min. Fractions were assayed for
protease activity. Those fractions having the greatest amount of
phosphoramidon-sensitive protease activity, the phosphoramidon
sensitive activity being due to the 41/38 kDa protease, infra,
were pooled. These fractions were found to elute at a range of
0.15-0.25 M NaCl. Fractions containing a predominance of
phosphoramidon-insensitive protease activity, the 58 kDa
protease, were also pooled. These fractions were found to elute
at a range of 0.25-0.35 M NaCl. The phosphoramidon-sensitive
protease fractions were then concentrated to a final volume of
0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device
Biomax-5K NMWL membrane. This material was applied at a flow
rate of 0.5 ml/min to a Pharmacia HR 10/30 column that had been
packed with Pharmacia Sephadex G-50 equilibrated in 10 mM sodium
phosphate buffer (pH 7.0)/0.1 M NaCl. Fractions having the
maximal phosphoramidon-sensitive protease activity were then
pooled and centrifuged over a Millipore Ultrafree®-15 centrifugal
filter device Biomax-50K NMWL membrane. Proteolytic activity
analysis, supra, indicated this material to have only
phosphoramidon-sensitive protease activity. Pooling of the
phosphoramidon-insensitive protease, the 58 kDa protein, was
followed by concentrating in a Millipore Ultrafree®-15
centrifugal filter device Biomax-50K NMWL membrane and further
separation on a Pharmacia Superdex-75 column. Fractions
containing the protease were pooled.
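
By way of illustration only, the Mono Q elution ranges reported
above can be related to gradient time and eluted volume. The
sketch assumes a strictly linear gradient and ignores column
dead volume and gradient delay, neither of which is specified in
the source; the function names are hypothetical.

```python
def gradient_concentration(t_min, start_m, end_m, duration_min):
    """NaCl molarity delivered at time t_min into a linear gradient."""
    fraction = min(max(t_min / duration_min, 0.0), 1.0)
    return start_m + fraction * (end_m - start_m)


def time_at_concentration(conc_m, start_m, end_m, duration_min):
    """Minutes into a linear gradient at which conc_m is reached."""
    low, high = min(start_m, end_m), max(start_m, end_m)
    if start_m == end_m or not low <= conc_m <= high:
        raise ValueError("concentration is outside the gradient range")
    return (conc_m - start_m) / (end_m - start_m) * duration_min


def elution_window(c_low, c_high, start_m, end_m, duration_min, flow_ml_min):
    """(t_low, t_high, v_low, v_high): minutes and ml from gradient start."""
    t_low = time_at_concentration(c_low, start_m, end_m, duration_min)
    t_high = time_at_concentration(c_high, start_m, end_m, duration_min)
    return t_low, t_high, t_low * flow_ml_min, t_high * flow_ml_min


# Mono Q step from the text: 0-0.5 M NaCl over 1 h at 2.0 ml/min.
# The 41/38 kDa protease pool eluted at 0.15-0.25 M NaCl.
print(elution_window(0.15, 0.25, 0.0, 0.5, 60.0, 2.0))
# The 58 kDa protease pool eluted at 0.25-0.35 M NaCl.
print(elution_window(0.25, 0.35, 0.0, 0.5, 60.0, 2.0))
```

Under these assumptions the two pools occupy adjacent windows of
the gradient, consistent with their being collected as separate
sets of fractions.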

Analysis of the purified 58 and 41/38 kDa proteases
revealed that, while both types of protease were completely
inhibited with 1,10 phenanthroline, only the 41/38 kDa protease
was inhibited with phosphoramidon. Further analysis of crude
broth indicated that the 41/38 kDa protease accounts for 23% of
the total protease activity of day one W-14 broth, increasing to
44% in day three W-14 broth.

Standard SDS-PAGE analysis for examining protein purity and
obtaining amino terminal sequence was performed using 4-20%
gradient MiniPlus SepraGels purchased from Integrated Separation
Systems (Natick, MA). Proteins to be amino-terminal sequenced
were blotted onto PVDF membrane following purification, infra
(ProBlott™ Membranes; Applied Biosystems, Foster City, CA),
visualized with 0.1% amido black, excised, and sent to Cambridge
ProChem, Cambridge, MA, for sequencing.

Deduced amino terminal sequences of the 58 kDa (SEQ ID NO:45)
and 41/38 kDa (SEQ ID NO:44) proteases from three day old W-14
broth were DV-GSEKANEKLK (SEQ ID NO:45) and DSGDDDKVTNTDIHR (SEQ
ID NO:44), respectively.

Sequencing of the 41/38 kDa protease revealed several amino
termini, each one having an additional amino acid removed by
proteolysis. Examination of the primary, secondary, tertiary and
quaternary sequences for the 38 and 41 kDa polypeptides allowed
for deduction of the sequence shown above and revealed that these
two proteases are homologous.

Example 11, Part A
Screening of Photorhabdus Genomic Library via use of Antibodies
for Genes encoding TcbA Peptide



In parallel to the sequencing described above, suitable
probing and sequencing was done based on the TcbAii peptide (SEQ
ID NO:1). This sequencing was performed by preparing bacterial
culture broths and purifying the toxin as described in Examples 1
and 2 above.

Genomic DNA was isolated from the Photorhabdus luminescens
strain W-14 grown in Grace's insect tissue culture medium. The
bacteria were grown in 5 ml of culture medium in a 250 ml
Erlenmeyer flask at 28°C and 250 rpm for approximately 24 hours.
Bacterial cells from 100 ml of culture medium were pelleted at
5000 x g for 10 minutes. The supernatant was discarded, and the
cell pellets then were used for the genomic DNA isolation.

The genomic DNA was isolated using a modification of the
CTAB method described in Section 2.4.3 of Ausubel (supra). The
section entitled "Large Scale CsCl prep of bacterial genomic DNA"
was followed through step 6. At this point, an additional
chloroform/isoamyl alcohol (24:1) extraction was performed
followed by a phenol/chloroform/isoamyl alcohol (25:24:1)
extraction step and a final chloroform/isoamyl alcohol (24:1)
extraction. The DNA was precipitated by the addition of a 0.6
volume of isopropanol. The precipitated DNA was hooked and wound
around the end of a bent glass rod, dipped briefly into 70%
ethanol as a final wash, and dissolved in 3 ml of TE buffer.
The DNA concentration, estimated by optical density at
230/260 nm, was approximately 2 mg/ml.
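
By way of illustration only, an optical density estimate of this
kind is conventionally converted to a concentration as sketched
below. The factor of 50 µg/ml per A260 unit for double-stranded
DNA in a 1 cm path is the standard laboratory convention and is
an assumption here, as are the function names, the dilution and
the absorbance reading; none of these are stated in the source.

```python
# Conventional conversion for double-stranded DNA (assumed, not from the source):
# an A260 of 1.0 in a 1 cm path corresponds to about 50 ug/ml.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Estimate dsDNA concentration of the undiluted stock from an A260 reading."""
    if a260 < 0 or dilution_factor <= 0 or path_length_cm <= 0:
        raise ValueError("inputs must be positive (A260 may be zero)")
    return a260 / path_length_cm * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def total_yield_mg(concentration_ug_per_ml, volume_ml):
    """Total DNA mass in mg for a given stock volume."""
    return concentration_ug_per_ml * volume_ml / 1000.0


# Hypothetical reading: a 1:100 dilution giving A260 = 0.40.
conc = dna_concentration_ug_per_ml(0.40, dilution_factor=100)
print(conc, total_yield_mg(conc, 3.0))  # stock was dissolved in 3 ml of TE
```

With these hypothetical inputs the sketch returns a stock
concentration of the same order as the approximately 2 mg/ml
reported above.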

Using this genomic DNA, a library was prepared.
Approximately 50 µg of genomic DNA was partly digested with
Sau3AI. Then NaCl density gradient centrifugation was used to
size fractionate the partially digested DNA fragments. Fractions
containing DNA fragments with an average size of 12 kb, or
larger, as determined by agarose gel electrophoresis, were
ligated into the plasmid Bluescript (Stratagene, La Jolla,
California) and transformed into an E. coli DH5α or DH10B strain.

Separately, purified aliquots of the protein were sent to
the biotechnology hybridoma center at the University of
Wisconsin, Madison for production of monoclonal antibodies to the
proteins. The material that was sent was the HPLC purified
fraction containing native bands 1 and 2 which had been denatured
at 65°C, and 20 µg of which was injected into each of four mice.
Stable monoclonal antibody-producing hybridoma cell lines were
recovered after spleen cells from an immunized mouse were fused
with a stable myeloma cell line. Monoclonal antibodies were
recovered from the hybridomas.

Separately, polyclonal antibodies were created by taking
native agarose gel purified band 1 (see Example 1) protein which
was then used to immunize a New Zealand white rabbit. The
protein was prepared by excising the band from the native agarose
gels, briefly heating the gel pieces to 65°C to melt the agarose,
and immediately emulsifying with adjuvant. Freund's complete
adjuvant was used for the primary immunizations and Freund's
incomplete was used for 3 additional injections at monthly
intervals. For each injection, approximately 0.2 ml of
emulsified band 1, containing 50 to 100 micrograms of protein,
was delivered by multiple subcutaneous injections into the back
of the rabbit. Serum was obtained 10 days after the final
injection and additional bleeds were performed at weekly
intervals for 3 weeks. The serum complement was inactivated by
heating to 56°C for 15 minutes and then stored at -20°C.

The monoclonal and polyclonal antibodies were then used to
screen the genomic library for the expression of antigens which
could be detected by the antibodies. Positive clones were
detected on nitrocellulose filter colony lifts. An immunoblot
analysis of the positive clones was undertaken.
An analysis of the clones as defined by both immunoblot and
Southern analysis resulted in the tentative identification of
five classes of clones.

In the first class of clone was a gene encoding the peptide
designated here as TcbAii. Full DNA sequence of this gene (tcbA)
was obtained. It is set forth as SEQ ID NO:11. Confirmation
that the sequence encodes the internal sequence of SEQ ID NO:1 is
demonstrated by the presence of SEQ ID NO:1 at amino acid number
88 from the deduced amino acid sequence created by the open
reading frame of SEQ ID NO:11. This can be confirmed by
referring to SEQ ID NO:12, which is the deduced amino acid
sequence created by SEQ ID NO:11.

The second class of toxin peptides contains the segments
referred to above as TcaBi, TcaBii and TcaC. Following the
screening of the library with the polyclonal antisera, this
second class of toxin genes was identified by several clones
which produced different size proteins, all of which
cross-reacted with the polyclonal antibody on an immunoblot and
were also found to share DNA homology on a Southern blot.
Sequence comparison revealed that they belonged to the gene
complex designated TcaB and TcaC above.

Three other classes of antibody toxin clones were also
isolated in the polyclonal screen. These classes produced
proteins that cross-reacted with a polyclonal antibody and also
shared DNA homology with the classes as determined by Southern
blotting. The classes have been designated Class III, Class IV
and Class V. It was also possible to identify monoclonals that
cross-reacted with Class I, II, III, and IV. This suggests that
all have regions of high protein homology. Thus, it appears that
the P. luminescens extracellular protein genes represent a family
of genes which are evolutionarily related.

To further pursue the concept that there might be
evolutionarily related variations in the toxin peptides contained
within this organism, two approaches have been undertaken to
examine other strains of P. luminescens for the presence of
related proteins. This was done both by PCR amplification of
genomic DNA and by immunoblot analysis using the polyclonal and
monoclonal antibodies.
The results indicate that related proteins are produced by
P. luminescens strains WX-2, WX-3, WX-4, WX-5, WX-6,
WX-[illegible], WX-[illegible], WX-11, WX-12, WX-15 and W-14.

Example 11, Part B

Sequence and analysis of Class III toxin clones - tcc

Further DNA sequencing was performed on plasmids isolated
from Class III E. coli clones described in Example 11, Part A.
The nucleotide sequence was shown to be three closely linked open
reading frames at this genomic locus. This locus was designated
tcc with the three open reading frames designated tccA SEQ ID
NO:56, tccB SEQ ID NO:58 and tccC SEQ ID NO:60 (Fig. 6B).

The deduced amino acid sequence from the tccA open reading
frame indicates the gene encodes a protein of 105,459 Da. This
protein was designated TccA. The first 12 amino acids of this
protein match the N-terminal sequence obtained from a 108 kDa
protein, SEQ ID NO:7, previously identified as part of the toxin
complex.
The deduced amino acid sequence from the tccB open reading
frame indicates this gene encodes a protein of 175,716 Da. This
protein was designated TccB. The first 11 amino acids of this
protein match the N-terminal sequence obtained from a protein
with estimated molecular weight of 185 kDa, SEQ ID NO:8.

The deduced amino acid sequence of tccC indicated that this
open reading frame encodes a protein of 111,694 Da and the
protein product was designated TccC.



Example 12

Characterization of Photorhabdus Strains

In order to establish that the collection described herein
was comprised of Photorhabdus strains, the strains herein were
assessed in terms of recognized microbiological traits that are
characteristic of Photorhabdus and which differentiate it from
other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J.J. 1984.
Bergey's Manual of Systematic Bacteriology, vol. 1, pp. 510-511
(ed. Krieg, N.R. and Holt, J.G.). Williams & Wilkins, Baltimore;
Akhurst and Boemare, 1988; Boemare et al., 1993). These
characteristic traits are as follows: Gram's stain negative
rods, organism size of 0.5-2 µm in width and 2-10 µm in length,
red/yellow colony pigmentation, presence of crystalline inclusion
bodies, presence of catalase, inability to reduce nitrate,
presence of bioluminescence, ability to take up dye from growth
media, positive for protease production, growth-temperature range
below 37°C, survival under anaerobic conditions and positively
motile (Table 18). Reference Escherichia coli, Xenorhabdus and
Photorhabdus strains were included in all tests for comparison.
The overall results are consistent with all strains being part of
the family Enterobacteriaceae and the genus Photorhabdus.

A luminometer was used to establish the bioluminescence of
each strain and provide a quantitative and relative measurement
of light production. For measurement of relative light emitting
units, the broths from each strain (cells and media) were
measured at three time intervals after inoculation in liquid
culture (6, 12, and 24 hr) and compared to background luminosity
(uninoculated media and water). Prior to measuring light
emission from the various broths, cell density was established by
measuring light absorbance (560 nm) in a Gilford Systems
(Oberlin, OH) spectrophotometer using a sipper cell. Appropriate
dilutions were then made (to normalize optical density to 1.0
unit) before measuring luminosity. Aliquots of the diluted
broths were then placed into cuvettes (300 µl each) and read in a
Bio-Orbit 1251 Luminometer (Bio-Orbit Oy, Turku, Finland). The
integration period for each sample was 45 seconds. The samples
were continuously mixed (spun in baffled cuvettes) while being
read to provide oxygen availability. A positive test was
determined as being ≥ 5-fold background luminescence (~5-10
units). In addition, colony luminosity was detected with
photographic film overlays and visually, after adaptation in a
darkroom.

The Gram's staining characteristics of each strain
were established with a commercial Gram's stain kit (BBL,
Cockeysville, MD) used in conjunction with Gram's stain control
slides (Fisher Scientific, Pittsburgh, PA). Microscopic
evaluation was then performed using a Zeiss microscope (Carl
Zeiss, Germany) 100X oil immersion objective lens (with 10X
ocular and 2X body magnification). Microscopic examination of
individual strains for organism size, cellular description and
inclusion bodies (the latter after logarithmic growth) was
performed using wet mount slides (10X ocular, 2X body and 40X
objective magnification) with oil immersion and phase contrast
microscopy with a micrometer (Akhurst, R.J. and Boemare, N.E.
1990. Entomopathogenic Nematodes in Biological Control (ed.
Gaugler, R. and Kaya, H.). pp. 75-90. CRC Press, Boca Raton,
USA; Baghdiguian S., Boyer-Giglio M.H., Thaler, J.O., Bonnot G.,
Boemare N. 1993. Biol. Cell 7[illegible], 177-185).

Colony pigmentation
was observed after inoculation on Bacto nutrient agar (Difco
Laboratories, Detroit, MI) prepared as per label instructions.
Incubation occurred at 28°C and descriptions were produced after
5-7 days.

To test for the presence of the enzyme catalase, a
colony of the test organism was removed on a small plug from a
nutrient agar plate and placed into the bottom of a glass test
tube. One ml of a household hydrogen peroxide solution was gently
added down the side of the tube. A positive reaction was
recorded when bubbles of gas (presumptive oxygen) appeared
immediately or within 5 seconds. Controls of uninoculated
nutrient agar and hydrogen peroxide solution were also examined.

To test for nitrate reduction, each culture was inoculated into
10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI).
After 24 hours incubation at 28°C, nitrite production was tested
by the addition of two drops of sulfanilic acid reagent and two
drops of alpha-naphthylamine reagent (see Difco Manual, 10th
edition, Difco Laboratories, Detroit, MI, 1984). The generation
of a distinct pink or red color indicates the formation of
nitrite from nitrate.

The ability of each strain to take up dye
from growth media was tested with Bacto MacConkey agar containing
the dye neutral red; Bacto Tergitol-7 agar containing the dye
bromothymol blue and Bacto EMB Agar containing the dye eosin-Y
(agars from Difco Laboratories, Detroit, MI, all prepared
according to label instructions). After inoculation on these
media, dye uptake was recorded after incubation at 28°C for 5
days. Growth on these latter media is characteristic for members
of the family Enterobacteriaceae.

Motility of each strain was
tested using a solution of Bacto Motility Test Medium (Difco
Laboratories, Detroit, MI) prepared as per label instructions. A
butt-stab inoculation was performed with each strain and motility
was judged macroscopically by a diffuse zone of growth spreading
from the line of inoculum. In many cases, motility was also
observed microscopically from liquid culture under wet mount
slides.

Biochemical nutrient evaluation for each strain was
performed using BBL Enterotube II (Becton Dickinson, Germany).
Product instructions were followed with the exception that
incubation was carried out at 28°C for 5 days. Results were
consistent with previously cited reports for Photorhabdus.

The production of protease was tested by observing hydrolysis of
gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI)
plates made as per label instructions. Cultures were inoculated
and the plates were incubated at 28°C for 5 days.

To assess
growth at different temperatures, agar plates [2% proteose
peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in
deionized water] were streaked from a common source of inoculum.
Plates were sealed with Nesco® film and incubated at 20, 28 and
37°C for up to three weeks. Plates showing no growth at 37°C
showed no cell viability after transfer to a 28°C incubator for
one week.

Oxygen requirements for Photorhabdus strains were
tested in the following manner. A butt-stab inoculation into
fluid thioglycolate broth medium (Difco, Detroit, MI) was made.
The tubes were incubated at room temperature for one week and
cultures were then examined for type and extent of growth. The
indicator resazurin demonstrates the level of medium oxidation or
the aerobiosis zone (Difco Manual, 10th edition, Difco
Laboratories, Detroit, MI). Growth zone results obtained for the
Photorhabdus strains tested were consistent with those of a
facultative anaerobic microorganism.
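
By way of illustration only, the bioluminescence test described
earlier in this Example (dilution of each broth to an optical
density of 1.0 unit, then scoring as positive any reading at
least 5-fold above background) can be sketched as follows. The
function names and the example readings are hypothetical and are
not taken from the source.

```python
def fold_dilution_to_unit_od(od560, target_od=1.0):
    """Fold dilution needed to bring a broth to the target optical density.

    Broths already at or below the target are left undiluted (fold = 1.0);
    that handling is an assumption of this sketch.
    """
    if od560 <= 0:
        raise ValueError("optical density must be positive")
    return max(od560 / target_od, 1.0)


def diluent_volume_ul(broth_ul, fold):
    """Volume of diluent to add to broth_ul of broth for a given fold dilution."""
    return broth_ul * (fold - 1.0)


def is_luminescence_positive(sample_units, background_units, min_fold=5.0):
    """Positive when the sample is at least min_fold times background."""
    return sample_units >= min_fold * background_units


# Hypothetical readings: OD560 of 2.5, background of 10 units.
fold = fold_dilution_to_unit_od(2.5)
print(fold, diluent_volume_ul(100.0, fold))
print(is_luminescence_positive(60.0, 10.0), is_luminescence_positive(40.0, 10.0))
```

Normalizing cell density before reading means that differences in
light output reflect the strain rather than how far the culture
had grown.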

Table 18

Taxonomic Traits of Photorhabdus Strains

Traits Assessed*

* A=Gram's stain, B=Crystalline inclusion bodies,
C=Bioluminescence, D=Cell form, E=Motility, F=Nitrate reduction,
G=Presence of catalase, H=Gelatin hydrolysis, I=Dye uptake,
J=Pigmentation, K=Growth on EMB agar, L=Growth on MacConkey agar,
M=Growth on Tergitol-7 agar, N=Facultative anaerobe, O=Growth at
20°C, P=Growth at 28°C, Q=Growth at 37°C. † +/- = positive or
negative for trait, rd=rod, S=sized within Genus descriptors,
RO=red-orange, LR=light red, R=red, O=orange, Y=yellow, T=tan,
LY=light yellow, YT=yellow tan, and LO=light orange.

Cellular fatty acid analysis is a recognized tool for
bacterial characterization at the genus and species level
(Tornabene, T.G. 1985. Lipid Analysis and the Relationship to
Chemotaxonomy in Methods in Microbiology, Vol. 18, 209-234;
Goodfellow, M. and O'Donnell, A.G. 1993. Roots of Bacterial
Systematics in Handbook of New Bacterial Systematics (ed.
Goodfellow, M. & O'Donnell, A.G.) pp. 3-54. London: Academic
Press Ltd.); these references are incorporated herein by
reference. Fatty acid analysis was used to confirm that our
collection was related at the genus level. Cultures were shipped
to an external, contract laboratory for fatty acid methyl ester
analysis (FAME) using a Microbial ID (MIDI, Newark, DE, USA)
Microbial Identification System (MIS). The MIS system consists of
a Hewlett Packard HP5890A gas chromatograph with a 25 m x 0.2 mm
5% methylphenyl silicone fused silica capillary column. Hydrogen
is used as the carrier gas and a flame-ionization detector
functions in conjunction with an automatic sampler, integrator
and computer. The computer compares the sample fatty acid methyl
esters to a microbial fatty acid library and against a
calibration mix of known fatty acids. As selected by the
contract laboratory, strains were grown for 24 hours at 28°C on
trypticase soy agar prior to analysis. Extraction of samples was
performed by the contract lab as per standard FAME methodology.
There was no direct identification of the strains to any
luminescent bacterial group other than Photorhabdus. When the
cluster analysis was performed, which compares the fatty acid
profiles of a group of isolates, the strain fatty acid profiles
were related at the genus level.

The evolutionary diversity of the Photorhabdus strains in
our collection was measured by analysis of PCR (Polymerase Chain
Reaction) mediated genomic fingerprinting using genomic DNA from
each strain. This technique is based on families of repetitive
DNA sequences present throughout the genome of diverse bacterial
species (reviewed by Versalovic, J., Schneider, M., de Bruijn,
F.J. and Lupski, J.R. 1994. Methods Mol. Cell. Biol., 5, 25-40).
Three of these, repetitive extragenic palindromic sequence (REP),
enterobacterial repetitive intergenic consensus (ERIC) and the
BOX element are thought to play an important role in the
organization of the bacterial genome. Genomic organization is
believed to be shaped by selection and the differential
dispersion of these elements within the genome of closely related
bacterial strains can be used to discriminate these strains (e.g.
Louws, F.J., Fulbright, D.W., Stephens, C.T. and de Bruijn, F.J.
1994. Appl. Environ. Micro. 60, 2286-2295). Rep-PCR utilizes
oligonucleotide primers complementary to these repetitive
sequences to amplify the variably sized DNA fragments lying
between them. The resulting products are separated by
electrophoresis to establish the DNA "fingerprint" for each
strain.

To isolate genomic DNA from our strains, cell pellets were
resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a
final volume of 10 ml and 12 ml of 5 M NaCl was then added. This
mixture was centrifuged 20 min. at 15,000 x g. The resulting
pellet was resuspended in 5.7 ml of TE and 300 µl of 10% SDS and
60 µl 20 mg/ml proteinase K (Gibco BRL Products, Grand Island,
NY) were added. This mixture was incubated at 37°C for 1 hr,
approximately 10 mg of lysozyme was then added and the mixture
was incubated for an additional 45 min. One milliliter of 5 M NaCl
and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were
then added and the mixture was incubated 10 min. at 65°C, gently
agitated, then incubated and agitated for an additional 20 min.
to aid in clearing of the cellular material. An equal volume of
chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed
gently then centrifuged. Two extractions were then performed with
an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1).
Genomic DNA was precipitated with 0.6 volume of isopropanol.
Precipitated DNA was removed with a glass rod, washed twice with
70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl
pH 8.0, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by
optical density at 260 nm. To perform rep-PCR analysis of
Photorhabdus genomic DNA the following primers were used: REP1R-I,
5'-IIIICGICGICATCIGGC-3' and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'.
PCR was performed using the following 25 µl reaction: 7.75 µl H2O,
2.5 µl 10X LA buffer (PanVera Corp., Madison, WI), 16 µl dNTP mix
(2.5 mM each), 1 µl of each primer at 50 pM/µl, 1 µl DMSO, 1.5 µl
genomic DNA (concentrations ranged from 0.075-0.480 µg/µl) and
0.25 µl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR
amplification was performed in a Perkin Elmer DNA Thermal Cycler
(Norwalk, CT) using the following conditions: 95°C/7 min. then 35
cycles of 94°C/1 min., 44°C/1 min., 65°C/8 min., followed by 15
min. at 65°C. After cycling, the 25 µl reaction was added to 5 µl
of 5X gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose
in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer
(0.09 M Tris-borate, 0.002 M EDTA) using 8 µl of each reaction.
The gel was run for approximately 16 hours at 45 V. Gels were then
stained in 20 µg/ml ethidium bromide for 1 hour and destained in
TBE buffer for approximately 3 hours. Polaroid® photographs of
the gels were then taken under UV illumination.

The presence or absence of bands at specific sizes for each
strain was scored from the photographs and entered as a
similarity matrix in the numerical taxonomy software program,
NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli
strain HB101 and Xanthomonas oryzae pv. oryzae assayed at the
same time produced PCR "fingerprints" corresponding to published
reports (Versalovic, J., Koeuth, T. and Lupski, J.R. 1991.
Nucleic Acids Res. 19, 6823-6831; Vera Cruz, C.M., Halda-Alija,
L., Louws, F., Skinner, D.Z., George, M.L., Nelson, R.J., de
Bruijn, F.J., Rice, C. and Leach, J.E. 1995. Int. Rice Res.
Notes, 20, 23-24; Vera Cruz, C.M., Ardales, E.Y., Skinner, D.Z.,
Talag, J., Nelson, R.J., Louws, F.J., Leung, H., Mew, T.W. and
Leach, J.E. 1996. Phytopathology (in press), respectively). The
data from Photorhabdus strains were then analyzed with a series
of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative
data) to generate a matrix of similarity coefficients (using the
Jaccard coefficient) and SAHN (Sequential, Agglomerative,
Hierarchical and Nested) clustering [using the UPGMA (Unweighted
Pair-Group Method with Arithmetic Averages) method] which groups
related strains and can be expressed as a phenogram (Figure 5).
The COPH (cophenetic values) and MXCOMP (matrix comparison)
programs were used to generate a cophenetic value matrix and
compare the correlation between this and the original matrix upon
which the clustering was based. A resulting normalized Mantel
statistic (r) was generated which is a measure of the goodness of
fit for a cluster analysis (r=0.8-0.9 represents a very good
fit). In our case r = 0.919. Therefore, our collection is
comprised of a diverse group of easily distinguishable strains
representative of the Photorhabdus genus.
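
By way of illustration only, the analysis chain described above
(Jaccard similarity from scored bands, UPGMA clustering, and
correlation of the cophenetic matrix with the original matrix)
can be sketched in a few lines. This is not the NTSYS-pc
implementation: the function names and the four band profiles
are hypothetical, and the correlation is computed as a plain
Pearson coefficient over all strain pairs.

```python
from itertools import combinations
from math import sqrt


def jaccard(a, b):
    """Jaccard coefficient of two 0/1 band-presence profiles."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0  # no bands at all: scored 0 here


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities keyed by frozenset({strain_a, strain_b})."""
    return {frozenset((p, q)): jaccard(profiles[p], profiles[q])
            for p, q in combinations(profiles, 2)}


def upgma_cophenetic(names, sim):
    """UPGMA on similarities; returns the cophenetic similarity of each pair."""
    clusters = {name: [name] for name in names}
    current = dict(sim)  # similarity between the clusters that currently exist
    coph = {}
    while len(clusters) > 1:
        pair = max(current, key=current.get)  # most similar pair of clusters
        a, b = tuple(pair)
        level = current[pair]
        for x in clusters[a]:
            for y in clusters[b]:
                coph[frozenset((x, y))] = level
        members = clusters.pop(a) + clusters.pop(b)
        merged = a + "+" + b
        for other, other_members in clusters.items():
            # Unweighted average over all original between-member similarities.
            values = [sim[frozenset((x, y))] for x in members for y in other_members]
            current[frozenset((merged, other))] = sum(values) / len(values)
        current = {k: v for k, v in current.items() if a not in k and b not in k}
        clusters[merged] = members
    return coph


def cophenetic_correlation(sim, coph):
    """Pearson correlation between original and cophenetic similarities."""
    keys = list(sim)
    xs, ys = [sim[k] for k in keys], [coph[k] for k in keys]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical band scores (1 = band present) for four strains.
profiles = {
    "S1": [1, 1, 0, 1, 0, 1],
    "S2": [1, 1, 0, 1, 1, 1],
    "S3": [0, 0, 1, 1, 0, 0],
    "S4": [0, 1, 1, 1, 0, 0],
}
sim = similarity_matrix(profiles)
coph = upgma_cophenetic(list(profiles), sim)
print(cophenetic_correlation(sim, coph))
```

A correlation near 1 indicates that the phenogram distorts the
original similarities very little, which is the sense in which
the r value reported above is read as a measure of goodness of
fit.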
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Example 13 

Insecticidal Utility of Toxin(s) Produced
by Various Photorhabdus Strains

Initial "seed" cultures of the various Photorhabdus strains
were produced by inoculating 175 ml of 2% Proteose Peptone #3
(PP3) (Difco Laboratories, Detroit, MI) liquid media with a
primary variant subclone in a 500 ml tribaffled flask with a
Delong neck, covered with a Kaput. Inoculum for each seed culture
was derived from oil-overlay agar slant cultures or plate
cultures. After inoculation, these flasks were incubated for 16
hrs at 28°C on a rotary shaker at 150 rpm. These seed cultures
were then used as uniform inoculum sources for a given
fermentation of each strain. Additionally, overlaying the
post-log seed culture with sterile mineral oil, adding a sterile
magnetic stir bar for future resuspension and storing the culture
in the dark at room temperature provided long-term preservation
of inoculum in a toxin-competent state. The production broths
were inoculated by adding 1% of the actively growing seed culture
to fresh 2% PP3 media (e.g. 1.75 ml per 175 ml fresh media).
Production of broths occurred in either 500 ml tribaffled flasks
(see above), or 2800 ml baffled, convex bottom flasks (500 ml
volume) covered by a silicon foam closure. Production flasks
were incubated for 24-48 hrs under the above mentioned
conditions. Following incubation, the broths were dispensed into
sterile 1 L polyethylene bottles, spun at 2600 x g for 1 hr at
10°C and decanted from the cell and debris pellet. The liquid
broth was then vacuum filtered through Whatman GF/D (2.7 µm
retention) and GF/B (1.0 µm retention) glass filters to remove
debris. Further broth clarification was achieved with a
tangential flow microfiltration device (Pall Filtron,
Northborough, MA) using a 0.5 µm open-channel filter. When
necessary, additional clarification could be obtained by chilling
the broth (to 4°C) and centrifuging for several hours at 2600 x
g. Following these procedures, the broth was filter sterilized
using a 0.2 µm nitrocellulose membrane filter. Sterile broths
were then used directly for biological assay, biochemical
analysis or concentrated (up to 15-fold) using a 10,000 MW
cut-off, M12 ultrafiltration device (Amicon, Beverly, MA) or
centrifugal concentrators (Millipore, Bedford, MA and Pall
Filtron, Northborough, MA) with a 10,000 MW pore size. In the
case of centrifugal concentrators, the broth was spun at 2000 x g
for approximately 2 hr. The 10,000 MW permeate was added to the
corresponding retentate to achieve the desired concentration of
components greater than 10,000 MW. Heat inactivation of
processed broth samples was achieved by heating the samples at
100°C in a sand-filled heat block for 10 minutes.
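
By way of illustration only, the two volume calculations implied
above (a 1% inoculum, and adding permeate back to an
over-concentrated retentate to reach a desired fold
concentration) can be sketched as follows. The function names
and the example retentate volume are hypothetical; only the 1%
inoculum, the 175 ml and 500 ml volumes and the 15-fold maximum
come from the source.

```python
def inoculum_volume_ml(fresh_media_ml, fraction=0.01):
    """Seed culture volume for a given fraction (1% by default) of fresh media."""
    return fresh_media_ml * fraction


def target_volume_ml(start_volume_ml, fold):
    """Final volume that gives the requested fold concentration."""
    if fold <= 0:
        raise ValueError("fold concentration must be positive")
    return start_volume_ml / fold


def permeate_to_add_back_ml(start_volume_ml, retentate_ml, fold):
    """Permeate to return to the retentate so that components retained by
    the membrane end up concentrated exactly `fold` times."""
    target = target_volume_ml(start_volume_ml, fold)
    if retentate_ml > target:
        raise ValueError("retentate is still too dilute; concentrate further")
    return target - retentate_ml


print(inoculum_volume_ml(175.0))                    # the text gives 1.75 ml per 175 ml
print(target_volume_ml(500.0, 15.0))                # 500 ml broth at 15-fold
print(permeate_to_add_back_ml(500.0, 25.0, 15.0))   # hypothetical 25 ml retentate
```

Adding back permeate rather than fresh medium keeps the
small-molecule background of the broth unchanged, so that only
the components above the membrane cut-off are concentrated.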

The broth(s) and toxin complex(es) from different
Photorhabdus strains are useful for reducing populations of
insects and were used in a method of inhibiting an insect
population which comprises applying to a locus of the insect an
effective insect inactivating amount of the active described. A
demonstration of the breadth of insecticidal activity observed
from broths of a selected group of Photorhabdus strains fermented
as described above is shown in Table 19. It is possible that
additional insecticidal activities could be detected with these
strains through increased concentration of the broth or by
employing different fermentation methods. Consistent with the
activity being associated with a protein, the insecticidal
activity of all strains tested was heat labile (see above).

Culture broth(s) from diverse Photorhabdus strains show
differential insecticidal activity (mortality and/or growth
inhibition, reduced adult emergence) against a number of insects.
More specifically, the activity is seen against corn rootworm
larvae and boll weevil larvae, which are members of the insect
order Coleoptera. Other members of the Coleoptera include
wireworms, pollen beetles, flea beetles, seed beetles and
Colorado potato beetle. Activity is also observed against aster
leafhopper and corn planthopper, which are members of the order
Homoptera. Other members of the Homoptera include planthoppers,
pear psylla, apple sucker, scale insects, whiteflies, spittle
bugs as well as numerous host specific aphid species. The broths
and purified toxin complex(es) are also active against tobacco
budworm, tobacco hornworm and European corn borer, which are
members of the order Lepidoptera. Other typical members of this
order are beet armyworm, cabbage looper, black cutworm, corn
earworm, codling moth, clothes moth, Indian mealmoth, leaf
rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent
caterpillar, sod webworm and fall armyworm. Activity is also
seen against fruitfly and mosquito larvae, which are members of
the order Diptera. Other members of the order Diptera are pea
midge, carrot fly, cabbage root fly, turnip root fly, onion fly,
crane fly, house fly and various mosquito species. Activity
with broth(s) and toxin complex(es) is also seen against
two-spotted spider mite, which is a member of the order Acarina,
which includes strawberry spider mites, broad mites, citrus red
mite, European red mite, pear rust mite and tomato russet mite.

Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth(s) (0-15 fold concentrated, filter
sterilized), 2% Proteose Peptone #3, purified toxin complex(es)
[0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were
applied directly to the surface (about 1.5 cm²) of artificial
diet (Rose, R. I. and McCabe, J. M. (1973). J. Econ. Entomol. 66,
398-400) in 40 µl aliquots. Toxin complex was diluted in 10 mM
sodium phosphate buffer, pH 7.0. The diet plates were allowed to
air-dry in a sterile flow-hood and the wells were infested with
single, neonate Diabrotica undecimpunctata howardi (Southern corn
rootworm, SCR) hatched from surface sterilized eggs. The plates
were sealed, placed in a humidified growth chamber and maintained
at 27°C for the appropriate period (3-5 days). Mortality and
larval weight determinations were then scored. Generally, 16
insects per treatment were used in all studies. Control
mortality was generally less than 5%.
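
A minimal sketch of how such a bioassay might be scored is given
below. It is not the scoring procedure actually used: the function
names are hypothetical, the 25% cut-off is taken from the footnote
to Table 19, and comparing mortality to the control by simple
subtraction is an assumption made here for illustration.

```python
from statistics import mean


def percent_mortality(dead: int, total: int) -> float:
    """Percent of infested insects found dead at scoring."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * dead / total


def percent_growth_inhibition(treated_weights, control_weights) -> float:
    """Reduction in mean weight of surviving larvae relative to control."""
    return 100.0 * (1.0 - mean(treated_weights) / mean(control_weights))


def is_sensitive(treated_mortality: float, control_mortality: float,
                 growth_inhibition: float, threshold: float = 25.0) -> bool:
    """Assumed rule: excess mortality or growth inhibition >= threshold."""
    excess = treated_mortality - control_mortality
    return excess >= threshold or growth_inhibition >= threshold


if __name__ == "__main__":
    # Hypothetical plate: 16 insects per treatment, as in the text.
    treated = percent_mortality(6, 16)
    control = percent_mortality(0, 16)
    inhibition = percent_growth_inhibition([0.1, 0.1], [0.2, 0.2])
    print(treated, control, inhibition, is_sensitive(treated, control, inhibition))
```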

Activity against boll weevil (Anthonomus grandis) was tested
as follows. Concentrated (1-10 fold) Photorhabdus broths,
control medium (2% Proteose Peptone #3), purified toxin
complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0
were applied in 60 µl aliquots to the surface of 0.35 g of
artificial diet (Stoneville Yellow lepidopteran diet) and allowed
to dry. A single, 12-24 hr boll weevil larva was placed on the
diet, and the wells were sealed and held at 25°C, 50% RH for 5
days. Mortality and larval weights were then assessed. Control
mortality ranged between 0-13%.

Activity against mosquito larvae was tested as follows. The
assay was conducted in a 96-well microtiter plate. Each well
contained 200 µl of aqueous solution (10-fold concentrated
Photorhabdus culture broth(s), control medium (2% Proteose
Peptone #3), 10 mM sodium phosphate buffer, toxin complex(es) at
0.23 mg/ml or H2O) and approximately 20, 1-day old larvae (Aedes
aegypti). There were 6 wells per treatment. The results were
read at 3-4 days after infestation. Control mortality was
between 0-20%.

Activity against fruitflies was tested as follows.
Purchased Drosophila melanogaster medium was prepared using 50%
dry medium and a 50% liquid of either water, control medium (2%
Proteose Peptone #3), 10-fold concentrated Photorhabdus culture
broth(s), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium
phosphate buffer, pH 7.0. This was accomplished by placing 4.0
ml of dry medium in each of 3 rearing vials per treatment and
adding 4.0 ml of the appropriate liquid. Ten late instar
Drosophila melanogaster maggots were then added to each 25 ml
vial. The vials were held on a laboratory bench, at room
temperature, under fluorescent ceiling lights. Pupal or adult
counts were made after 15 days of exposure. Adult emergence was
assessed relative to the water and control medium treatments
(0-16% reduction).

Activity against aster leafhopper adults (Macrosteles
severini) and corn planthopper nymphs (Peregrinus maidis) was
tested with an ingestion assay designed to allow ingestion of the
active without other external contact. The reservoir for the
active/"food" solution is made by making 2 holes in the center of
the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm
M® square is placed across the top of the dish and secured with
an "O" ring. A 1 oz. plastic cup is then infested with
approximately 7 hoppers and the reservoir is placed on top of the
cup, Parafilm down. The test solution is then added to the
reservoir through the holes. In tests using 10-fold concentrated
Photorhabdus culture broth(s), the broth and control medium (2%
Proteose Peptone #3) were dialyzed against 10 mM sodium phosphate
buffer, pH 7.0, and sucrose (to 5%) was added to the resulting
solution to reduce control mortality. Purified toxin complex(es)
[0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 was also
tested. The assay was held in an incubator at 28°C, 70% RH with
a 16/8 photoperiod. The assays were graded for mortality at 72
hours (day 3). Control mortality was less than 6%.

Activity against lepidopteran larvae was tested as follows.
Concentrated (10-fold) Photorhabdus culture broth(s), control
medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23
mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied
directly to the surface (~1.5 cm²) of standard artificial
lepidopteran diet (Stoneville Yellow diet) in 40 µl aliquots.
The diet plates were allowed to air-dry in a sterile flow-hood
and each well was infested with a single, neonate larva. European
corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca
sexta) eggs were obtained from commercial sources and hatched
in-house, whereas tobacco budworm (Heliothis virescens) larvae
were supplied internally. Following infestation with larvae, the
diet plates were sealed, placed in a humidified growth chamber and
maintained in the dark at 27°C for the appropriate period.
Mortality and weight determinations were scored at day 5.
Generally, 16 insects per treatment were used in all studies.
Control mortality generally ranged from 4-12.5% for control
medium and was less than 10% for phosphate buffer.

Activity against two-spotted spider mite (Tetranychus
urticae) was determined as follows. Young squash plants were
trimmed to a single cotyledon and sprayed to run-off with 10-fold
concentrated broth(s), control medium (2% Proteose Peptone #3),
purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate
buffer, pH 7.0. After drying, the plants were infested with a
mixed population of spider mites and held at lab temperature and
humidity for 72 hr. Live mites were then counted to determine
levels of control.

Table 19

Observed Insecticidal Spectrum of Broths From Different
Photorhabdus Strains

Photorhabdus Strain     Sensitive* Insect Species

WX-1                    3**, 4, 5, 6, 7, 8
WX-2                    2, 4
WX-3                    1, 4
WX-4                    1, 4
WX-5                    4
WX-6                    4
WX-7                    3, 4, 5, 6, 7, 8
WX-8                    1, 2, 4
WX-9                    1, [illegible], 4
WX-10                   4
WX-11                   1, 2, 4
WX-12                   2, 4, 5, 6, 7, 8
WX-14                   1, 2, 4
WX-15                   1, 2, 4
W30                     3, 4, 5, 8
NC-1                    1, 2, 3, 4, 5, 6, 7,
WIR                     2, 3, 5, 6, 7, 8
HP88                    1, 3, 4, 5, 7, 8
Hb                      3, 4, 5, 7, 8
Hm                      1, 2, 3, 4, 5, 7, 8
H9                      1, 2, 3, 4, 5, 6, 7,
W-14                    1, 2, 3, 4, 5, 6, 7,
ATCC 43948              4
ATCC 43949              4
ATCC 43950              4
ATCC 43951              4
ATCC 43952              4

*  = ≥ 25% mortality and/or growth inhibition vs. control
** = 1; Tobacco budworm, 2; European corn borer, 3;
     Tobacco hornworm, 4; Southern corn rootworm, 5;
     Boll weevil, 6; Mosquito, 7; Fruit Fly, 8;
     Aster Leafhopper, 9; Corn planthopper, 10;
     Two-spotted spider mite.

Example 14
Non W-14 Photorhabdus Strains:
Purification, Characterization and Activity Spectrum

Purification

The protocol, as follows, is similar to that developed for
the purification of W-14 and was established based on purifying
those fractions having the most activity against Southern corn
rootworm (SCR), as determined in bioassays (see Example 13).
Typically, 4-20 L of broth that had been filtered, as described
in Example 13, were received and concentrated using an Amicon
spiral ultrafiltration cartridge Type S1Y100 attached to an
Amicon M-12 filtration device. The retentate contained native
proteins of molecular sizes greater than 100 kDa, whereas the
flow-through material contained native proteins less than 100 kDa
in size. The majority of the activity against SCR was contained
in the 100 kDa retentate. The retentate was then continually
diafiltered with 10 mM sodium phosphate (pH = 7.0) until the
filtrate reached an A280 < 0.100. Unless otherwise stated, all
procedures from this point were performed in buffer as defined by
10 mM sodium phosphate (pH 7.0). The retentate was then
concentrated to a final volume of approximately 0.20 L and
filtered using a 0.45 µm Nalgene™ Filterware sterile filtration
unit. The filtered material was loaded at 7.5 ml/min onto a
Pharmacia HR16/10 column which had been packed with PerSeptive
Biosystem Poros® 50 HQ strong anion exchange matrix equilibrated
in buffer using a PerSeptive Biosystem Sprint® HPLC system.
After loading, the column was washed with buffer until an A280 of
0.100 was achieved. Proteins were then eluted from the column at
2.5 ml/min using buffer with 0.4 M NaCl for 20 min for a total
volume of 50 ml. The column was then washed using buffer with
1.0 M NaCl at the same flow rate for an additional 20 min (final
volume = 50 ml). Proteins eluted with 0.4 M and 1.0 M NaCl were
placed in separate dialysis bags (Spectra/Por® Membrane MWCO:
2,000) and allowed to dialyze overnight at 4°C in 12 L buffer.
The majority of the activity against SCR was contained in the 0.4
M fraction. The 0.4 M fraction was further purified by
application of 20 ml to a Pharmacia XK 26/100 column that had
been prepacked with Sepharose CL4B (Pharmacia) using a flow rate
of 0.75 ml/min. Fractions were pooled based on A280 peak profile
and concentrated to a final volume of 0.75 ml using a Millipore
Ultrafree®-15 centrifugal filter device (Biomax-50K NMWL
membrane). Protein concentrations were determined using a Biorad
Protein Assay Kit with bovine gamma globulin as a standard.
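
The flow-rate figures quoted in the purification above are
internally consistent, as the following short check illustrates.
The helper names are hypothetical, and the calculation assumes a
constant pump rate with no hold-up volume.

```python
def pumped_volume_ml(flow_ml_per_min: float, minutes: float) -> float:
    """Volume delivered at a constant flow rate."""
    return flow_ml_per_min * minutes


def pumping_time_min(volume_ml: float, flow_ml_per_min: float) -> float:
    """Time needed to deliver a volume at a constant flow rate."""
    if flow_ml_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ml / flow_ml_per_min


if __name__ == "__main__":
    # Elution step from the text: 2.5 ml/min for 20 min.
    print(pumped_volume_ml(2.5, 20.0))
    # Loading approximately 0.20 L of retentate at 7.5 ml/min.
    print(round(pumping_time_min(200.0, 7.5), 1))
```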

Characterization

The native molecular weight of the SCR toxin complex was
determined using a Pharmacia HR 16/50 that had been prepacked
with Sepharose CL4B in buffer. The column was then calibrated
using proteins of known molecular size, thereby allowing for
calculation of the toxin's approximate native molecular size. As
shown in Table 20, the molecular size of the toxin complex ranged
from 777 kDa with strain Hb to 1,900 kDa with strain WX-14. The
yield of toxin complex also varied, from strain WX-12 producing
0.8 mg/L to strain Hb, which produced 7.0 mg/L.
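
One common way to turn such a column calibration into a size
estimate is to fit log10(molecular weight) against elution volume
for the standards and interpolate. The sketch below illustrates
that general approach; it is not stated to be the calculation used
here, and the standards listed are hypothetical placeholder values
chosen only to make the example run.

```python
import math

# Hypothetical calibration points: (elution volume in ml, size in kDa).
HYPOTHETICAL_STANDARDS = [(50.0, 1000.0), (60.0, 100.0), (70.0, 10.0)]


def fit_calibration(standards):
    """Least-squares fit of log10(MW) = slope * Ve + intercept."""
    xs = [ve for ve, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(elution_volume_ml: float, slope: float,
                    intercept: float) -> float:
    """Interpolated native size for an observed elution volume."""
    return 10.0 ** (slope * elution_volume_ml + intercept)


if __name__ == "__main__":
    slope, intercept = fit_calibration(HYPOTHETICAL_STANDARDS)
    print(round(estimate_mw_kda(55.0, slope, intercept)))
```

Estimates from such a fit are only as good as the linear range of
the calibration, which is one reason native sizes are reported as
approximate.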

Proteins found in the toxin complex were examined for
individual polypeptide size using SDS-PAGE analysis. Typically,
20 µg protein of the toxin complex from each strain was loaded
onto a 2-15% polyacrylamide gel (Integrated Separation Systems)
and electrophoresed at 20 mA in Biorad SDS-PAGE buffer. After
completion of electrophoresis, the gels were stained overnight in
Biorad Coomassie blue R-250 (0.2% in methanol:acetic acid:water;
40:10:40 v/v/v). Subsequently, gels were destained in
methanol:acetic acid:water; 40:10:40 (v/v/v). The gels were
then rinsed with water for 15 min and scanned using a Molecular
Dynamics Personal Laser Densitometer®. Lanes were quantitated
and molecular sizes were calculated as compared to Biorad high
molecular weight standards, which ranged from 200-45 kDa.

Sizes of the individual polypeptides comprising the SCR
toxin complex from each strain are listed in Table 21. The sizes
of the individual polypeptides ranged from 230 kDa with strain
WX-1 to a size of 16 kDa, as seen with strain WX-7. Every
strain, with the exception of strain Hb, had polypeptides
comprising the toxin complex that were in the 160-230 kDa range,
the 100-160 kDa range, and the 50-80 kDa range. These data
indicate that the toxin complex may vary in peptide composition
and components from strain to strain; however, in all cases the
toxin appears to consist of a large, oligomeric protein complex.

Table 20

Characterization of a Toxin Complex From
Non W-14 Photorhabdus Strains

Strain     Approx. Native        Yield Active
           Molecular Wt. a       Fraction (mg/L) b

H9         972,000               1.8
Hb         777,000               7.0
Hm         1,400,000             1.1
HP88       813,000               2.5
NC-1       1,092,000             3.3
WIR        979,000               1.0
WX-1       973,000               0.8
WX-2       951,000               2.2
WX-7       1,000,000             1.5
WX-12      898,000               0.4
WX-14      1,900,000             1.9
W-14       860,000               7.5

a Native molecular weight determined using a Pharmacia HR
  16/50 column packed with Sepharose CL4B
b Amount of toxin complex recovered from culture broth.



Activity Spectrum

As shown in Table 22, the toxin complexes purified from
strains Hm and H9 were tested for activity against a variety of
insects, with the toxin complex from strain W-14 for comparison.
The assays were performed as described in Example 13. The toxin
complex from all three strains exhibited activity against tobacco
budworm, European corn borer, Southern corn rootworm, and aster
leafhopper. Furthermore, the toxin complex from strains Hm and
W-14 also exhibited activity against two-spotted spider mite. In
addition, the toxin complex from W-14 exhibited activity against
mosquito larvae. These data indicate that the toxin complex,
while having similarities in activities between certain orders of
insects, can also exhibit differential activities against other
orders of insects.

Table 21

The Approximate Sizes (in kDa) of Peptides in a Purified
Toxin Complex From Non W-14 Photorhabdus

Strain   Peptide sizes (kDa)

H9       180, 170, 160, 140, 120, 98, 87, 84, 79, 72, 68, 60, 57,
         52, 46, 40, 37
Hb       150, 140, 139, 130, 120, 100, 98, 88, 81, 75, 69, 60, 57,
         54, 49, 44, 39, 37, 35
Hm       170, 140, 100, 81, 72, 68, 49, 46, 30, 22, 20, 19
HP88     170, 160, 140, 130, 129, 110, 100, 86, 81, 77, 73, 60, 58,
         45, 39, 35
NC-1     180, 170, 140, 110, 44, 16
WIR      170, 160, 120, 110, 89, 79, 74, 62, 51, 40, 39, 37, 33,
         30, 28, 27, 25, 23
WX-1     230, 190, 170, 160, 110, 98, 76, 58, 53, 41, 35, 31, 28,
         24, 22
WX-2     200, 170, 150, 120, 110, 82, 64, 37, 30
WX-7     200, 180, 110, 87, 75, 43, 33, 28, 26, 23, 22, 21, 19,
         18, 16
WX-12    180, 160, 140, 139, 130, 110, 92, 87, 80, 73, 59, 56, 51,
         37, 33, 32, 26
WX-14    210, 180, 160, 120, 110, 100, 95, 80, 69, 49, 41, 33
W-14     190, 180, 170, 160, 150, 130, 120, [illegible], 93, 90,
         77, [illegible], 65, [illegible], 60, 51, [illegible], 40,
         [illegible], [illegible]

Table 22

Observed Insecticidal Spectrum of a Purified Toxin Complex from
Photorhabdus Strains

Photorhabdus Strain     Sensitive* Insect Species

Hm Toxin Complex        1**, 2, 3, 5, 6, 7, 8
H9 Toxin Complex        1, 2, 3, 6, 7, 8
W-14 Toxin Complex      1, 2, 3, 4, 5, 6, 7, 8

*  = > 25% mortality or growth inhibition
** = 1; Tobacco budworm, 2; European corn borer, 3; Southern
     corn rootworm, 4; Mosquito, 5; Two-spotted spider mite,
     6; Aster Leafhopper, 7; Fruit Fly, 8; Boll Weevil



Example 15 

Sub-Fractionation of Photorhabdus Protein Toxin complex 

The Photorhabdus protein toxin complex was isolated as
described in Example 14. Next, about 10 mg toxin was applied to
a MonoQ 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a
flow rate of 1 ml/min. The column was washed with 20 mM Tris-HCl,
pH 7.0 until the optical density at 280 nm returned to baseline
absorbance. The proteins bound to the column were eluted with a
linear gradient of 0 to 1.0 M NaCl in 20 mM Tris-HCl, pH 7.0 at 1
ml/min for 30 min. One ml fractions were collected and subjected
to Southern corn rootworm (SCR) bioassay (see Example 13). Peaks
of activity were determined by a series of dilutions of each
fraction in SCR bioassays. Two activity peaks against SCR were
observed and were named A (eluted at about 0.2-0.3 M NaCl) and B
(eluted at 0.3-0.4 M NaCl). Activity peaks A and B were pooled
separately and both peaks were further purified using a 3-step
procedure described below.
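
For orientation, the salt concentration reaching each 1 ml fraction
of the MonoQ gradient above can be estimated with a simple linear
model. The sketch below assumes an ideal gradient with no delay
volume between the mixer and the fraction collector, which a real
system will not satisfy exactly; the function names are
hypothetical.

```python
def gradient_conc_molar(elapsed_min: float, start_m: float = 0.0,
                        end_m: float = 1.0,
                        duration_min: float = 30.0) -> float:
    """NaCl concentration of an ideal linear gradient at a given time."""
    progress = min(max(elapsed_min / duration_min, 0.0), 1.0)
    return start_m + (end_m - start_m) * progress


def fractions_in_window(low_m: float, high_m: float,
                        flow_ml_per_min: float = 1.0,
                        fraction_ml: float = 1.0,
                        duration_min: float = 30.0):
    """Fraction numbers whose midpoint concentration lies in [low, high]."""
    n_fractions = int(round(duration_min * flow_ml_per_min / fraction_ml))
    hits = []
    for n in range(1, n_fractions + 1):
        midpoint_min = (n - 0.5) * fraction_ml / flow_ml_per_min
        conc = gradient_conc_molar(midpoint_min, duration_min=duration_min)
        if low_m <= conc <= high_m:
            hits.append(n)
    return hits


if __name__ == "__main__":
    print("peak A window:", fractions_in_window(0.2, 0.3))
    print("peak B window:", fractions_in_window(0.3, 0.4))
```

Under these idealized assumptions the two elution windows fall in
adjacent groups of fractions, which is consistent with peaks A and
B being pooled separately.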

Solid (NH4)2SO4 was added to the above protein fraction to a
final concentration of 1.7 M. Proteins were then applied to a
phenyl-Superose 5/5 column equilibrated with 1.7 M (NH4)2SO4 in
50 mM potassium phosphate buffer, pH 7 at 1 ml/min. Proteins
bound to the column were eluted with a linear gradient of 1.7 M
(NH4)2SO4, 0% ethylene glycol, 50 mM potassium phosphate, pH 7.0
to 25% ethylene glycol, 25 mM potassium phosphate, pH 7.0 (no
(NH4)2SO4) at 0.5 ml/min. Fractions were dialyzed overnight
against 10 mM sodium phosphate buffer, pH 7.0. Activities in
each fraction against SCR were determined by bioassay.

The fractions with the highest activity were pooled and
applied to a MonoQ 5/5 column which was equilibrated with 20 mM
Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column
were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in
20 mM Tris-HCl, pH 7.0.

For the final step of purification, the most active
fractions above (determined by SCR bioassay) were pooled and
subjected to a second phenyl-Superose 5/5 column. Solid
(NH4)2SO4 was added to a final concentration of 1.7 M. The
solution was then loaded onto the column equilibrated with 1.7 M
(NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7 at 1 ml/min.
Proteins bound to the column were eluted with a linear gradient
of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM
potassium phosphate, pH 7.0 at 0.5 ml/min. Fractions were
dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0.
Activities in each fraction against SCR were determined by
bioassay.

The final purified protein by the above 3-step procedure
from peak A was named toxin A and the final purified protein from
peak B was named toxin B.



Characterization and Amino Acid Sequencing of Toxin A and Toxin B

In SDS-PAGE, both toxin A and toxin B contained two major (>
90% of total Coomassie stained protein) peptides: 192 kDa (named
A1 and B1, respectively) and 58 kDa (named A2 and B2,
respectively). Both toxin A and toxin B revealed only one major
band in native PAGE, indicating A1 and A2 were subunits of one
protein complex, and B1 and B2 were subunits of one protein
complex. Further, the native molecular weights of both toxin A
and toxin B were determined to be 860 kDa by gel filtration
chromatography. The relative molar concentration of A1 to A2
was judged to be a 1 to 1 equivalence as determined by
densitometric analysis of SDS-PAGE gels. Similarly, B1 and B2
peptides were present at the same molar concentration.
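
The step from band densities to a molar ratio can be illustrated
with a short calculation. This sketch assumes that stain intensity
is proportional to protein mass for both peptides, which is only
approximately true for Coomassie staining, and the function names
are hypothetical.

```python
def molar_ratio(stain_a: float, mw_a_kda: float,
                stain_b: float, mw_b_kda: float) -> float:
    """Moles of A per mole of B, from stain (mass) signals and sizes."""
    return (stain_a / mw_a_kda) / (stain_b / mw_b_kda)


def mass_fractions_at_equimolar(mw_a_kda: float, mw_b_kda: float):
    """Expected share of total stain for each peptide at 1:1 molar ratio."""
    total = mw_a_kda + mw_b_kda
    return mw_a_kda / total, mw_b_kda / total


if __name__ == "__main__":
    # 192 kDa and 58 kDa are the subunit sizes reported above.
    large, small = mass_fractions_at_equimolar(192.0, 58.0)
    print(round(100 * large, 1), round(100 * small, 1))
    print(molar_ratio(large, 192.0, small, 58.0))
```

Under these assumptions a 1 to 1 molar equivalence corresponds to
roughly three quarters of the stain appearing in the 192 kDa band.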

Toxin A and toxin B were electrophoresed in 10% SDS-PAGE and
transblotted to PVDF membranes. Blots were sent for amino acid
analysis and N-terminal amino acid sequencing at Harvard
MicroChem and Cambridge ProChem, respectively. The N-terminal
amino acid sequence of B1 was determined to be identical to SEQ
ID NO:1, the TcbAii region of the tcbA gene (SEQ ID NO:12,
position 37 to 99). A unique N-terminal sequence was obtained
for peptide B2 (SEQ ID NO:40). The N-terminal amino acid
sequence of peptide B2 was identical to the TcbAiii region of the
derived amino acid sequence for the tcbA gene (SEQ ID NO:12,
position 1935 to 1945). Therefore, the B toxin contained
predominantly two peptides, TcbAii and TcbAiii, that were
observed to be derived from the same gene product, TcbA.

The N-terminal sequence of A2 (SEQ ID NO:41) was unique in
comparison to the TcbAiii peptide and other peptides. The A2
peptide was denoted TcdAiii (see Example 17). SEQ ID NO:6 was
determined to be a mixture of amino acid sequences SEQ ID NO:40
and 41.

Peptides A1 and A2 were further subjected to internal amino
acid sequencing. For internal amino acid sequencing, 10 µg of
toxin A was electrophoresed in 10% SDS-PAGE and transblotted to
PVDF membrane. After the blot was stained with amido black,
peptides A1 and A2, denoted TcdAii and TcdAiii, respectively,
were excised from the blot and sent to Harvard MicroChem and
Cambridge ProChem. Peptides were subjected to trypsin digestion
followed by HPLC chromatography to separate individual peptides.
N-terminal amino acid analysis was performed on selected tryptic
peptide fragments. Two internal amino acid sequences of peptide
A1 (TcdAii-PK71, SEQ ID NO:38 and TcdAii-PK44, SEQ ID NO:39) were
found to have significant homologies with deduced amino acid
sequences of the TcbAii region of the tcbA gene (SEQ ID NO:12).
Similarly, the N-terminal sequence (SEQ ID NO:41) and two
internal sequences of peptide A2 (TcdAiii-PK57, SEQ ID NO:42 and
TcdAiii-PK20, SEQ ID NO:43) also showed significant homology with
deduced amino acid sequences of the TcbAiii region of the tcbA
gene (SEQ ID NO:12).

In summary, the toxin complex contains at least two protein
toxin complexes active against SCR: toxin A and toxin B. Toxin A
and toxin B are similar in their native and subunit molecular
weights; however, their peptide compositions are different.
Toxin A contains peptides TcdAii and TcdAiii as the major
peptides, and toxin B contains TcbAii and TcbAiii as the major
peptides.

Example 16
Cleavage and Activation of TcbA Peptide

In the toxin B complex, peptides TcbAii and TcbAiii originate
from the single gene product TcbA (Example 15). The processing of
TcbA peptide to TcbAii and TcbAiii is presumably by the action of
Photorhabdus protease(s), most likely the metalloproteases
described in Example 10. In some cases, it was noted that when
Photorhabdus W-14 broth was processed, TcbA peptide was present in
toxin B complex as a major component, in addition to peptides
TcbAii and TcbAiii. Identical procedures, described for the
purification of toxin B complex (Example 15), were used to enrich
peptide TcbA from the toxin complex fraction of W-14 broth. The
final purified material was analyzed in a 4-20% gradient SDS-PAGE
and major peptides were quantified by densitometry. It was
determined that TcbA, TcbAii and TcbAiii comprised 58%, 36%, and
6%, respectively, of total protein. The identities of these
peptides were confirmed by their respective molecular sizes in
SDS-PAGE and Western blot analysis using monospecific antibodies.
The native molecular weight of this fraction was determined to be
860 kDa.

The cleavage of TcbA was evaluated by treating the above
purified material with purified 38 kDa and 58 kDa W-14
Photorhabdus metalloproteases (Example 10), and trypsin as a
control enzyme (Sigma, MO). The standard reaction consisted of
17.5 µg of the above purified fraction, 1.5 units of protease,
and 0.1 M Tris buffer, pH 8.0 in a total volume of 100 µl. For
the control reaction, protease was omitted. The reaction mixtures
were incubated at 37°C for 90 min. At the end of the reaction,
20 µl was taken and boiled with SDS-PAGE sample buffer
immediately for electrophoresis analysis in a 4-20% gradient
SDS-PAGE. It was determined from SDS-PAGE that in both 38 kDa and
58 kDa protease treatments, the amount of peptides TcbAii and
TcbAiii increased about 3-fold while the amount of TcbA peptide
decreased proportionally (Table 23). The relative reduction and
augmentation of selected peptides was confirmed by Western blot
analyses. Furthermore, gel filtration of the cleaved material
revealed that the native molecular size of the complex remained
the same. Upon trypsin treatment, peptides TcbA and TcbAii were
nonspecifically digested into small peptides. This indicated that
38 kDa and 58 kDa Photorhabdus proteases can specifically process
peptide TcbA into peptides TcbAii and TcbAiii. The
protease-treated and untreated control portions of the remaining
80 µl reaction mixture were serially diluted with 10 mM sodium
phosphate buffer, pH 7.0 and analyzed by SCR bioassay. By
comparing activity in several dilutions, it was determined that
the 38 kDa protease treatment increased SCR insecticidal activity
approximately 3 to 4 fold. The growth inhibition of remaining
insects in the protease treatment was also more severe than in
the control (Table 23).





Table 23

Conversion and activation of peptide TcbA into peptides TcbAii and
TcbAiii by protease treatment.

                                Control    38 kDa protease treatment

TcbA (% of total protein)       58         18
TcbAii (% of total protein)     36         64
TcbAiii (% of total protein)    6          18
LD50 (µg protein)               2.1        0.52
SCR Weight (mg/insect)*         0.2        0.1

*: an indication of growth inhibition by measuring the average
weight of live insects after 5 days on diet in the assay.
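
The fold changes discussed in this example can be recomputed
directly from the values in Table 23. The sketch below is
illustrative only; the dictionary layout and function names are
hypothetical, and potency is taken here as the ratio of control
LD50 to treated LD50.

```python
# Values transcribed from Table 23: (control, 38 kDa protease treatment).
TABLE_23 = {
    "TcbA_pct": (58.0, 18.0),
    "TcbAii_pct": (36.0, 64.0),
    "TcbAiii_pct": (6.0, 18.0),
    "LD50_ug": (2.1, 0.52),
    "SCR_weight_mg": (0.2, 0.1),
}


def fold_change(control: float, treated: float) -> float:
    """Treated value expressed as a multiple of the control value."""
    return treated / control


def potency_increase(control_ld50: float, treated_ld50: float) -> float:
    """A lower LD50 means higher potency, so the ratio is inverted."""
    return control_ld50 / treated_ld50


if __name__ == "__main__":
    for name in ("TcbA_pct", "TcbAii_pct", "TcbAiii_pct"):
        print(name, round(fold_change(*TABLE_23[name]), 2))
    print("potency", round(potency_increase(*TABLE_23["LD50_ug"]), 1))
```

On these figures the LD50 ratio is close to 4, in line with the
approximately 3 to 4 fold increase in activity reported above.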



Example 17

Screening of the library for a gene encoding the TcdAii Peptide

The cloning and characterization of a gene encoding the
TcdAii peptide, described as SEQ ID NO:17 (internal peptide
TcdAii-PT111 N-terminal sequence) and SEQ ID NO:18 (internal
peptide TcdAii-PT79 N-terminal sequence), was completed. Two
pools of degenerate oligonucleotides, designed to encode the
amino acid sequences of SEQ ID NO:17 (Table 24) and SEQ ID NO:18
(Table 25), and the reverse complements of those sequences, were
synthesized as described in Example 8. The DNA sequence of the
oligonucleotides is given below:

Polymerase Chain Reactions (PCR) were performed essentially
as described in Example 8, using as forward primers P2.3.5.CB or
P2.3.5, and as reverse primers P2.79.R.1 or P2.79R.CB, in all
forward/reverse combinations, using Photorhabdus W-14 genomic DNA
as template. In another set of reactions, primers P2.79.2 or
P2.79.3 were used as forward primers, and P2.3.5R, P2.3.5F.I, and
P2.3R.CB were used as reverse primers in all forward/reverse
combinations. Only in the reactions containing P2.3.6.CB as the
forward primers combined with P2.79.R.1 or P2.79R.CB as the
reverse primers was a non-artifactual amplified product seen, of
estimated size (mobility on agarose gels) of 2500 base pairs.
The order of the primers used to obtain this amplification
product indicates that the peptide fragment TcdAii-PT111 lies
amino-proximal to the peptide fragment TcdAii-PT79.
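
The general idea behind a degenerate oligonucleotide pool and its
reverse complement can be sketched as follows. This is a generic
illustration using the standard genetic code and IUPAC ambiguity
codes, not the primer design actually used in Example 8; the
peptide "MWDKE" is a hypothetical placeholder and is not one of the
disclosed sequences. For amino acids with six codons (Leu, Arg,
Ser) this simple per-position approach also covers some codons for
other amino acids.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODE_TO_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES_TO_CODE = {frozenset(v): k for k, v in CODE_TO_BASES.items()}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def codons_for(amino_acid: str):
    """All standard codons that encode one amino acid."""
    return [b1 + b2 + b3
            for i, b1 in enumerate(BASES)
            for j, b2 in enumerate(BASES)
            for k, b3 in enumerate(BASES)
            if AMINO[16 * i + 4 * j + k] == amino_acid]


def degenerate_codon(amino_acid: str) -> str:
    """Collapse the codons for one amino acid into IUPAC codes."""
    codons = codons_for(amino_acid)
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return "".join(BASES_TO_CODE[frozenset(c[pos] for c in codons)]
                   for pos in range(3))


def back_translate(peptide: str) -> str:
    """Degenerate oligonucleotide (5' to 3') encoding a peptide."""
    return "".join(degenerate_codon(aa) for aa in peptide)


def pool_size(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligo."""
    size = 1
    for code in oligo:
        size *= len(CODE_TO_BASES[code])
    return size


def reverse_complement(oligo: str) -> str:
    """Reverse complement, valid for IUPAC ambiguity codes as well."""
    return oligo.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    forward = back_translate("MWDKE")  # hypothetical peptide
    print(forward, pool_size(forward), reverse_complement(forward))
```

Choosing peptide stretches rich in low-degeneracy residues keeps
the pool small, which is one reason internal peptide sequences are
screened before primers are designed.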

The 2500 bp PCR products were ligated to the plasmid vector
pCR™II (Invitrogen, San Diego, CA) according to the supplier's
instructions, and the DNA sequences across the ends of the insert
fragments of two isolates (HS24 and HS27) were determined using
the supplier's recommended primers and the sequencing methods
described previously. The sequence of both isolates was the
same. New primers were synthesized based on the determined
sequence, and used to prime additional sequencing reactions to
obtain a total of 2557 bases of the insert [SEQ ID NO:36].
Translation of the partial peptide encoded by SEQ ID NO:36
yields the 845 amino acid sequence disclosed as SEQ ID NO:37.
Protein homology analysis of this portion of the TcdAii peptide
fragment reveals substantial amino acid homology (68% similarity;
53% identity) to residues 542 to 1390 of protein TcbA [SEQ ID
NO:12]. It is therefore apparent that the gene represented in
part by SEQ ID NO:36 produces a protein of similar, but not
identical, amino acid sequence to the TcbA protein, and which
likely has similar, but not identical, biological activity to the
TcbA protein.
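
Percent identity and percent similarity of the kind quoted above
are computed over an alignment of the two sequences. The sketch
below shows one simple way to do this for a pre-aligned pair. It is
not the homology program used here: the similarity groups are an
assumed, simplified grouping of amino acids, the convention of
skipping gapped positions is an assumption, and the two short
sequences are hypothetical.

```python
# Assumed conservative-substitution groups (illustrative only).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"),
                  set("NQ"), set("ST"), set("AG")]


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Percent identity and similarity over ungapped aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gapped positions are skipped in this convention
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, '-' marks a gap.
    print(identity_and_similarity("MKVLDE-AS", "MRVIDQGAT"))
```

Different alignment programs and scoring matrices give somewhat
different percentages for the same pair of sequences, so such
figures are best compared only within one method.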

In yet another instance, a gene encoding the peptides
TcdAii-PK44 and the TcdAiii 58 kDa N-terminal peptide, described
as SEQ ID NO:9 (internal peptide TcdAii-PK44 sequence), and SEQ ID
NO:41 (TcdAiii 58 kDa N-terminal peptide sequence) was isolated.
Two pools of degenerate oligonucleotides, designed to encode the
amino acid sequences described as SEQ ID NO:39 (Table 27) and SEQ
ID NO:41 (Table 26), and the reverse complements of those
sequences, were synthesized as described in Example 3. and their
DNA sequences.

Polymerase Chain Reactions (PCR) were performed essentially
as described in Example 8, using as forward primers A1.44.1 or
A1.44.2, and reverse primers A2.3R or A2.4R, in all
forward/reverse combinations, using Photorhabdus W-14 genomic DNA
as template. In another set of reactions, primers A2.1 or A2.2
were used as forward primers, and A1.44.1R and A1.44.2R were
used as reverse primers in all forward/reverse combinations.
Only in the reactions containing A1.44.1 or A1.44.2 as the
forward primers combined with A2.3R as the reverse primer was a
non-artifactual amplified product seen, of estimated size
(mobility on agarose gels) of 1400 base pairs. The order of the
primers used to obtain this amplification product indicates that
the peptide fragment TcdAii-PK44 lies amino-proximal to the 58
kDa peptide fragment of TcdAiii.
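
Each reverse primer in the reactions above is the reverse complement of
a coding-strand oligonucleotide, and the pools are degenerate. A minimal
Python sketch of that operation follows. It uses the standard IUPAC
ambiguity codes; the function name and the example sequence in the
usage note are hypothetical, since the primer sequences themselves are
not reproduced in this section.

```python
# Complement table covering the four bases and the IUPAC ambiguity codes
# (R<->Y, K<->M, B<->V, D<->H; S, W and N are their own complements).
_COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    seq = seq.upper().replace(" ", "")
    unknown = set(seq) - set("ACGTRYKMSWBDHVN")
    if unknown:
        raise ValueError("unexpected symbols: " + "".join(sorted(unknown)))
    return seq.translate(_COMPLEMENT)[::-1]
```

For example, `reverse_complement("ATG CAR")` gives `"YTGCAT"`: the
degenerate R (A or G) becomes Y (C or T), and the order is reversed.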

The 1400 bp PCR products were ligated to the plasmid vector
pCR™II according to the supplier's instructions. The DNA
sequences across the ends of the insert fragments of four
isolates were determined using primers similar in sequence to the
supplier's recommended primers and using sequencing methods
described previously. The nucleic acid sequence of all isolates
differed as expected in the regions corresponding to the
degenerate primer sequences, but the amino acid sequences deduced
from these data were the same as the actual amino acid sequences
for the peptides determined previously (SEQ ID NOS:41 and 39).

Screening of the W-14 genomic cosmid library as described in
Example 8 with a radiolabeled probe composed of the DNA
prepared above (SEQ ID NO:36) identified five hybridizing cosmid
isolates, namely 17D9, 20B10, 21D2, 27B10, and 26D1. These
cosmids were distinct from those previously identified with
probes corresponding to the genes described as SEQ ID NO:11 or
SEQ ID NO:25. Restriction enzyme analysis and DNA blot
hybridizations identified three EcoR I fragments, of approximate
sizes 3.7, 3.7, and 1.1 kbp, that span the region comprising the
DNA of SEQ ID NO:36. Screening of the W-14 genomic cosmid
library using as probe the radiolabeled 1.4 kbp DNA fragment
prepared in this example identified the same five cosmids (17D9,
20B10, 21D2, 27B10, and 26D1). DNA blot hybridization to EcoR I-
digested cosmid DNAs also showed hybridization to the same subset
of EcoR I fragments as seen with the 2.5 kbp TcdAii gene probe,
indicating that both fragments are encoded on the genomic DNA.

DNA sequence determination of the cloned EcoR I fragments
revealed an uninterrupted reading frame of 7551 base pairs (SEQ
ID NO:46), encoding a 282.9 kDa protein of 2516 amino acids (SEQ
ID NO:47). Analysis of the amino acid sequence of this protein
revealed all expected internal fragments of peptides TcdAii (SEQ
ID NOS:17, 18, 37, 38 and 39) and the TcdAiii peptide N-terminus
(SEQ ID NO:41) and all TcdAiii internal peptides (SEQ ID NOS:42
and 43). The peptides isolated and identified as TcdAii and
TcdAiii are each products of the open reading frame, denoted
tcdA, disclosed as SEQ ID NO:46. Further, SEQ ID NO:47 shows,
starting at position 89, the sequence disclosed as SEQ ID NO:13,
which is the N-terminal sequence of a peptide of size
approximately 201 kDa, indicating that the initial protein
produced from SEQ ID NO:46 is processed in a manner similar to
that previously disclosed for SEQ ID NO:12. In addition, the
protein is further cleaved to generate a product of size 209.2
kDa, encoded by SEQ ID NO:48 and disclosed as SEQ ID NO:49
(TcdAii peptide), and a product of size 63.6 kDa, encoded by SEQ
ID NO:50 and disclosed as SEQ ID NO:51 (TcdAiii peptide). Thus,
it is thought that the insecticidal activity identified as toxin
A (Example 15) derives from the products of SEQ ID NO:46, as
exemplified by the full-length protein of 282.9 kDa disclosed as
SEQ ID NO:47, which is processed to produce the peptides
disclosed as SEQ ID NOS:49 and 51. It is thought that the
insecticidal activity identified as toxin B (Example 15) derives
from the products of SEQ ID NO:11, as exemplified by the 280.5 kDa
protein disclosed as SEQ ID NO:12. This protein is proteolytically
processed to yield the 207.6 kDa peptide disclosed as SEQ ID
NO:53, which is encoded by SEQ ID NO:52, and the 62.9 kDa peptide
having N-terminal sequence disclosed as SEQ ID NO:40, and further
disclosed as SEQ ID NO:55, which is encoded by SEQ ID NO:54.

Amino acid sequence comparisons between the proteins
disclosed as SEQ ID NO:12 and SEQ ID NO:47 reveal that they have
69% similarity and 54% identity. This high degree of
evolutionary relationship is not uniform throughout the entire
amino acid sequence of these peptides, but is higher towards the
carboxy-terminal end of the proteins, since the peptides
disclosed as SEQ ID NO:51 (derived from SEQ ID NO:47) and SEQ ID
NO:55 (derived from SEQ ID NO:12) have 76% similarity and 64%
identity.
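
As a rough plausibility check on the reading-frame lengths and protein
sizes given above, the following Python sketch converts an open reading
frame length into a residue count and a mass estimate. It assumes the
reading frame length includes the stop codon and uses the common
rule-of-thumb average of about 110 Da per residue; both are assumptions
for illustration, and the masses stated in the text were presumably
calculated from the actual amino acid composition.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def orf_residue_count(orf_length_bp, includes_stop=True):
    """Number of amino acids encoded by a reading frame of the given length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("reading frame length is not a multiple of 3")
    codons = orf_length_bp // 3
    return codons - 1 if includes_stop else codons


def estimate_mass_kda(residues):
    """Approximate protein mass in kDa from the residue count."""
    return residues * AVG_RESIDUE_MASS_DA / 1000.0
```

With these assumptions, a 7551 bp frame gives 2516 residues and an
estimate near 277 kDa, and a 7515 bp frame gives 2504 residues and an
estimate near 275 kDa, the same range as the full-length masses
reported for the two proteins.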


Example 18 

Control of European Corn Borer-Induced Leaf Damage on Maize Plants
by Spray Application of Photorhabdus (Strain W-14) Broth

The ability of Photorhabdus toxin(s) to reduce plant damage
caused by insect larvae was demonstrated by measuring leaf damage
caused by European corn borer (Ostrinia nubilalis) infested onto
maize plants treated with Photorhabdus broth. Fermentation broth
from Photorhabdus strain W-14 was produced and concentrated
approximately 10-fold using ultrafiltration (10,000 MW pore-size)
as described in Example 13. The resulting concentrated broth was
then filter sterilized using 0.2 micron nitrocellulose membrane
filters. A similarly prepared sample of uninoculated 2% proteose
peptone #3 was used for control purposes. Maize plants (a
DowElanco proprietary inbred line) were grown from seed to
vegetative stage 7 or 8 in pots containing a soilless mixture in
a greenhouse (27°C day; 22°C night, about 50% RH, 14 hr day-
length, watered/fertilized as needed). The test plants were
arranged in a randomized complete block design (3 reps/treatment,
6 plants/treatment) in a greenhouse with temperature about 22°C
day; 18°C night, no artificial light and with partial shading,
about 50% RH and watered/fertilized as needed. Treatments
(uninoculated media and concentrated Photorhabdus broth) were
applied with a syringe sprayer, 2.0 ml applied from directly
(about 6 inches) over the whorl and 2.0 additional ml applied in
a circular motion from approximately one foot above the whorl.
In addition, one group of plants received no treatment. After
the treatments had dried (approximately 30 minutes), twelve
neonate European corn borer larvae (eggs obtained from commercial
sources and hatched in-house) were applied directly to the whorl.
After one week, the plants were scored for damage to the leaves
using a modified Guthrie Scale (Koziel, M. G., Beland, G. L.,
Bowman, C., Carozzi, N. B., Crenshaw, R., Crossland, L., Dawson,
J., Desai, M., Hill, M., Kadwell, S., Launis, K., Lewis, K.,
Maddox, D., McPherson, K., Meghji, M. Z., Merlin, E., Rhodes, R.,
Warren, G. W., Wright, M. and Evola, S. V. (1993)
Bio/Technology 11: 194-195) and the scores were compared
statistically [T-test (LSD) p<0.05 and Tukey's Studentized Range
(HSD) Test p<0.1]. The results are shown in Table 28. For
reference, a score of 1 represents no damage, a score of 2
represents fine "window pane" damage on the unfurled leaf with no
pinhole penetration and a score of 5 represents leaf penetration
with elongated lesions and/or mid rib feeding evident on more
than three leaves (lesions < 1 inch). These data indicate that
broth or other protein-containing fractions may confer protection
against specific insect pests when delivered in a sprayable
formulation or when the gene or derivative thereof, encoding the
protein or part thereof, is delivered via a transgenic plant or
microbe.

                         Table 28

          Effect of Photorhabdus Culture Broth on
      European Corn Borer-Induced Leaf Damage on Maize

Treatment                  Average Guthrie Score

No Treatment               5.02 a
Uninoculated medium        5.15 a
Photorhabdus Broth         2.24 b

Means with different letters are statistically different
(p<0.05 or p<0.1).

Example 19 

Genetic Engineering of Genes for Expression in E. coli 

Summary of constructions

A series of plasmids were constructed to express the tcbA
gene of Photorhabdus W-14 in Escherichia coli. A list of the
plasmids is shown in Table 29. A brief description of each
construction follows as well as a summary of the E. coli
expression data obtained.

                         Table 29

           Expression plasmids for the tcbA gene.

Plasmid          Gene    Vector/Selection    Compartment

pDAB634          tcbA    pBC/Chl             Intracellular
pAcGP67B/tcbA    tcbA    pAcGP67B/Amp        Baculovirus, secreted
pDAB635          tcbA    pET27b/Kan          Periplasm
pET15-tcbA       tcbA    pET15-tcbA          Intracellular

Construction of pDAB634 

In Example 9, a large EcoR I fragment which hybridizes to
the TcbAii probe is described. This fragment was subcloned into
pBC (Stratagene, La Jolla CA). Sequence analysis indicates that
this fragment is 8816 base pairs. The fragment encodes the tcbA
gene with the initiating ATG at position 571 and the terminating
TAA at position 8086. The fragment therefore carries 570 base
pairs of Photorhabdus DNA upstream of the ATG and 730 base pairs
downstream of the TAA.
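
The flanking lengths quoted above follow from simple coordinate
arithmetic. The short Python sketch below reproduces them under two
assumptions that the text does not state explicitly: positions are
1-based, and the position given for the TAA is that of its last base.
The function name is hypothetical.

```python
def flanking_lengths(fragment_len, start_codon_pos, stop_codon_last_pos):
    """Return (upstream, downstream) flank lengths in base pairs.

    Assumes 1-based coordinates, with start_codon_pos the first base of
    the ATG and stop_codon_last_pos the last base of the stop codon.
    """
    if not 1 <= start_codon_pos <= stop_codon_last_pos <= fragment_len:
        raise ValueError("coordinates are inconsistent with the fragment length")
    upstream = start_codon_pos - 1
    downstream = fragment_len - stop_codon_last_pos
    return upstream, downstream
```

Calling `flanking_lengths(8816, 571, 8086)` returns `(570, 730)`, the
upstream and downstream figures stated for the subcloned fragment.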

Construction of Plasmid pAcGP67B/tcbA

The tcbA gene was PCR amplified using the following primers:
5' primer (S1Ac51) 5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA
TC 3' and 3' primer (S1Ac31) 5' TTT AAA GCG GCC GCT TAA CGG ATG
GTA TAA CGA ATA TG 3'. PCR was performed using a TaKaRa LA PCR
kit from PanVera (Madison, Wisconsin) in the following reaction:
57.5 µl water, 10 µl 10X LA buffer, 16 µl dNTPs (2.5 mM each
stock solution), 20 µl each primer at 10 pmoles/µl, 300 ng of the
plasmid pDAB634 containing the W-14 tcbA gene and one µl of
TaKaRa LA Taq polymerase. The cycling conditions were 98°C/20
sec, 68°C/5 min, 72°C/10 min for 30 cycles. A PCR product of the
expected size of about 7526 bp was isolated in a 0.8% agarose gel
in TBE (100 mM Tris, 90 mM boric acid, 1 mM EDTA) buffer and
purified using a Qiaex II kit from Qiagen (Chatsworth,
California). The purified tcbA gene was digested with Nco I and
Not I and ligated into the baculovirus transfer vector pAcGP67B
(PharMingen (San Diego, California)) and transformed into DH5α
E. coli. The tcbA gene was then cut from pAcGP67B and transferred
to pET27b to create plasmid pDAB635. A missense mutation in the
tcbA gene was repaired in pDAB635.

The repaired tcbA gene contains two changes from the
sequence shown in SEQ ID NO:11: an A>G at 212 changing an
asparagine 71 to serine 71 and a G>A at 229 changing an alanine
77 to threonine 77. These changes are both upstream of the
proposed TcbAii N-terminus.
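
The two base changes described above can be checked against the first
240 nucleotides of SEQ ID NO:11 as printed in the sequence listing. The
Python sketch below is only an illustration: it uses the standard
genetic code, copies the first five rows of the listing, and the helper
name `substitution_effect` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

# Nucleotides 1-240 of SEQ ID NO:11, copied row by row from the listing.
SEQ11_START = "".join([
    "ATG CAA AAC TCA TTA TCA AGC ACT ATC GAT ACT ATT TGT CAG AAA CTG",
    "CAA TTA ACT TGT CCG GCG GAA ATT GCT TTG TAT CCC TTT GAT ACT TTC",
    "CGG GAA AAA ACT CGG GGA ATG GTT AAT TGG GGG GAA GCA AAA CGG ATT",
    "TAT GAA ATT GCA CAA GCG GAA CAG GAT AGA AAC CTA CTT CAT GAA AAA",
    "CGT ATT TTT GCC TAT GCT AAT CCG CTG CTG AAA AAC GCT GTT CGG TTG",
]).replace(" ", "")


def substitution_effect(cds, pos, ref, alt):
    """Return (codon number, old residue, new residue) for a base change.

    pos is the 1-based nucleotide position within the coding sequence.
    """
    i = pos - 1
    if cds[i] != ref:
        raise ValueError("reference base mismatch at position %d" % pos)
    start = (i // 3) * 3  # first base of the affected codon
    old_codon = cds[start:start + 3]
    new_codon = old_codon[:i - start] + alt + old_codon[i - start + 1:]
    return i // 3 + 1, CODON_TABLE[old_codon], CODON_TABLE[new_codon]
```

Here `substitution_effect(SEQ11_START, 212, "A", "G")` returns
`(71, "N", "S")` and `substitution_effect(SEQ11_START, 229, "G", "A")`
returns `(77, "A", "T")`, in agreement with the asparagine-to-serine
and alanine-to-threonine changes stated in the text.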



Construction of pET15-tcbA

The tcbA coding region of pDAB635 was transferred to vector
pET15b. This was accomplished using shotgun ligations; the DNAs
were cut with restriction enzymes Nco I and Xho I. The resulting
recombinant is called pET15-tcbA.

Expression of TcbA in E. coli from plasmid pET15-tcbA

Expression of tcbA in E. coli was obtained by modification
of the methods previously described by Studier et al. (Studier,
F.W., Rosenberg, A., Dunn, J., and Dubendorff, J. (1990) Use of
T7 RNA polymerase to direct expression of cloned genes. Methods
Enzymol. 185: 60-89). Competent E. coli cells, strain BL21(DE3),
were transformed with plasmid pET15-tcbA and plated on LB agar
containing 100 µg/ml ampicillin and 40 mM glucose. The
transformed cells were plated to a density of several hundred
isolated colonies/plate. Following overnight incubation at 37°C
the cells were scraped from the plates and suspended in LB broth
containing 100 µg/ml ampicillin. Typical culture volumes were
from 200-500 ml. At time zero, culture densities (OD600) were
from 0.05-0.15 depending on the experiment. Cultures were shaken
at one of three temperatures (22°C, 30°C or 37°C) until a density
of 0.15-0.5 was obtained, at which time they were induced with 1
mM isopropylthio-β-galactoside (IPTG). Cultures were incubated
at the designated temperature for 4-5 hours and then were
transferred to 4°C until processing (12-72 hours).



Purification and characterization of TcbA expressed in E. coli
from Plasmid pET15-tcbA

E. coli cultures expressing TcbA peptides were processed as
follows. Cells were harvested by centrifugation at 17,000 x g and
the media was decanted and saved in a separate container.

The media was concentrated about 8x using the M12 (Amicon,
Beverly MA) filtration system and a 100 kD molecular mass cut-off
filter. The concentrated media was loaded onto an anion exchange
column and the bound proteins were eluted with 1.0 M NaCl. The
1.0 M NaCl elution peak was found to cause mortality against
Southern corn rootworm (SCR) larvae (Table 30). The 1.0 M NaCl
fraction was dialyzed against 10 mM sodium phosphate buffer pH
7.0, concentrated, and subjected to gel filtration on Sepharose
CL-4B (Pharmacia, Piscataway, New Jersey). The region of the CL-
4B elution profile corresponding to the same calculated molecular
weight (about 900 kDa) as the native W-14 toxin complex was
collected, concentrated and bioassayed against larvae. The
collected 900 kDa fraction was found to have insecticidal activity
(see Table 30 below), with symptomology similar to that caused by
native W-14 toxin complex. This fraction was subjected to
Proteinase K and heat treatment; the activity in both cases was
either eliminated or reduced, providing evidence that the activity
is proteinaceous in nature. In addition, the active fraction tested
immunologically positive for the TcbA and TcbAiii peptides in
immunoblot analysis when tested with an anti-TcbAiii monoclonal
antibody (Table 30).

                              Table 30

               Results of Immunoblot and SCR Bioassays.

                         SCR Activity            Immunoblot   Native Size
Fraction                 %          % Growth     Peptides     (CL-4B
                         Mortality  Inhibit.     Detected     Estimated Size)

TcbA Media 1.0 M         +++        +++          TcbA
Ion Exchange

TcbA Media CL-4B         +++        +++          TcbA,        ~900 kDa
                                                 TcbAiii

TcbA Media CL-4B         ++         +++          NT
+ Proteinase K

TcbA Media CL-4B                                 NT
+ heat treatment

TcbA Cell Sup CL-4B                 +++          NT           ~900 kD

PK = Proteinase K treatment 2 hours; Heat treatment = 100°C for 10
minutes; ND = None Detected; NT = Not Tested. Scoring system for
mortality and growth inhibition as compared to control samples: 5-
24%="+", 25-49%="++", 50-100%="+++".
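
The scoring system in the table footnote maps a percentage (mortality
or growth inhibition relative to controls) onto a symbol. A minimal
Python sketch of that mapping is shown below. The footnote does not
define a symbol for values under 5%, so returning "-" in that case is
an assumption made here for illustration, as is the function name.

```python
def activity_symbol(percent):
    """Map a percent mortality or growth inhibition to the table symbol.

    5-24% -> "+", 25-49% -> "++", 50-100% -> "+++" (per the footnote).
    Values below 5% return "-", which is an assumption: the footnote
    does not assign a symbol to that range.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent >= 50:
        return "+++"
    if percent >= 25:
        return "++"
    if percent >= 5:
        return "+"
    return "-"
```

For example, `activity_symbol(30)` returns `"++"` and
`activity_symbol(75)` returns `"+++"`.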



The cell pellet was resuspended in 10 mM sodium phosphate
buffer, pH 7.0, and lysed by passage through a Bio-Neb™ cell
nebulizer (Glas-Col Inc., Terre Haute, IN). The pellets were
treated with DNase to remove DNA and centrifuged at 17,000 x g to
separate the cell pellet from the cell supernatant. The
supernatant fraction was decanted and filtered through a 0.2
micron filter to remove large particles and subjected to anion
exchange chromatography. Bound proteins were eluted with 1.0 M
NaCl, dialyzed and concentrated using Biomax™ (Millipore Corp,
Bedford, MA) concentrators with a molecular mass cut-off of
50,000 Daltons. The concentrated fraction was subjected to gel
filtration chromatography using Sepharose CL-4B beaded matrix.
Bioassay data for material prepared in this way is shown in Table
30 and is denoted as "TcbA Cell Sup".

In yet another method to handle large amounts of material,
the cell pellets were re-suspended in 10 mM sodium phosphate
buffer, pH 7.0, and thoroughly homogenized by using a Kontes
Glass Company (Vineland, NJ) 40 ml tissue grinder. The cellular
debris was pelleted by centrifugation at 25,000 x g and the cell
supernatant was decanted, passed through a 0.2 micron filter and
subjected to anion exchange chromatography using a Pharmacia
10/10 column packed with Poros HQ 50 beads. The bound proteins
were eluted by performing a NaCl gradient of 0.0 to 1.0 M.
Fractions containing the TcbA protein were combined and
concentrated using a 50 kDa concentrator and subjected to gel
filtration chromatography using Pharmacia CL-4B beaded matrix.
The fractions containing TcbA oligomer, molecular mass of
approximately 900 kDa, were collected and subjected to anion
exchange chromatography using a Pharmacia Mono Q 10/10 column
equilibrated with 20 mM Tris buffer pH 7.3. A gradient of 0.0
to 1.0 M NaCl was used to elute recombinant TcbA protein.
Recombinant TcbA eluted from the column at a salt concentration
of approximately 0.3-0.4 M NaCl, the same molarity at which
native TcbA oligomer is eluted from the Mono Q 10/10 column. The
recombinant TcbA fraction was found to cause SCR mortality in
bioassay experiments similar to those in Table 30.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

     (i) APPLICANT: Ensign, Jerald C
                    Bowen, David J
                    Petell, James
                    Fatig, Raymond
                    Schoonover, Sue
                    ffrench-Constant, Richard
                    Orr, Gregory L
                    Merlo, Donald J
                    Roberts, Jean L
                    Rocheleau, Thomas A
                    Blackburn, Michael B
                    Hey, Timothy D
                    Strickland, James A

    (ii) TITLE OF INVENTION: Insecticidal Protein Toxins From
                             Photorhabdus

   (iii) NUMBER OF SEQUENCES: 61

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Quarles & Brady
          (B) STREET: 1 South Pinckney Street
          (C) CITY: Madison
          (D) STATE: WI
          (E) COUNTRY: US
          (F) ZIP: 53703

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:
          (C) CLASSIFICATION:

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/063,615
          (B) FILING DATE: 18-MAY-1993

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/395,497
          (B) FILING DATE: 28-FEB-1995

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/007,255
          (B) FILING DATE: 06-NOV-1995

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/608,423
          (B) FILING DATE: 23-FEB-1996

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/705,484
          (B) FILING DATE: 23-AUG-1996

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Seay, Nicholas J
          (B) REGISTRATION NUMBER: 27,386
          (C) REFERENCE/DOCKET NUMBER: 960296.93804

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: 608-251-5000
          (B) TELEFAX: 608-251-9166



(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 11 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

     Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn
     1 5 10

(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

     Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Trp
     1 5 10

(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 19 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

     Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp Ala
     1 5 10 15

     Leu Val Ala

(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 14 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

     Ala Ser Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn
     1 5 10

(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 9 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

     Ala Gly Asp Thr Ala Asn Ile Gly Asp
     1 5

(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 15 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

     Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Gln Ile
     1 5 10 15

(2) INFORMATION FOR SEQ ID NO:7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 11 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

     Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu
     1 5 10

(2) INFORMATION FOR SEQ ID NO:8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 9 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

     Met Asn Leu Ala Ser Pro Leu lis Ser
     1 5

(2) INFORMATION FOR SEQ ID NO:9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 16 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

     Met Ile Asn Leu Asp Ile Asn Glu Gln Asn Lys Ile Met Val Val Ser
     1 5 10 15

(2) INFORMATION FOR SEQ ID NO:10:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

     (v) FRAGMENT TYPE: N-terminal

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

     Ala Ala Lys Asp Val Lys Phe Gly Ser Asp Ala Arg Val Lys Met Leu
     1 5 10 15

     Arg Gly Val Asn
     20

(2) INFORMATION FOR SEQ ID NO:11:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 7515 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1..7515

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

ATG CAA AAC TCA TTA TCA AGC ACT ATC GAT ACT ATT TGT CAG AAA CTG  48
Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu
1 5 10 15

CAA TTA ACT TGT CCG GCG GAA ATT GCT TTG TAT CCC TTT GAT ACT TTC  96
Gln Leu Thr Cys Pro Ala Glu Ile Ala Leu Tyr Pro Phe Asp Thr Phe
20 25 30

CGG GAA AAA ACT CGG GGA ATG GTT AAT TGG GGG GAA GCA AAA CGG ATT  144
Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg Ile
35 40 45

TAT GAA ATT GCA CAA GCG GAA CAG GAT AGA AAC CTA CTT CAT GAA AAA  192
Tyr Glu Ile Ala Gln Ala Glu Gln Asp Arg Asn Leu Leu His Glu Lys
50 55 60

CGT ATT TTT GCC TAT GCT AAT CCG CTG CTG AAA AAC GCT GTT CGG TTG  240
Arg Ile Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu
65 70 75 80

GGT ACC CGG CAA ATG TTG GGT TTT ATA CAA GGT TAT AGT GAT CTG TTT  288
Gly Thr Arg Gln Met Leu Gly Phe Ile Gln Gly Tyr Ser Asp Leu Phe
85 90 95

GGT AAT CGT GCT GAT AAC TAT GCC GCG CGG GGC TCG GTT GCA TCG ATG  336
Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met
100 105 110

TTC TCA CCG GCG GCT TAT TTG ACG GAA TTG TAC CGT GAA GCC AAA AAC  384
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn
115 120 125

TTG CAT GAC AGC AGC TCA ATT TAT TAC CTA GAT AAA CGT CGC CCG GAT  432
Leu His Asp Ser Ser Ser Ile Tyr Tyr Leu Asp Lys Arg Arg Pro Asp
130 135 140

TTA GCA AGC TTA ATG CTC AGC CAG AAA AAT ATG GAT GAG GAA ATT TCA  480
Leu Ala Ser Leu Met Leu Ser Gln Lys Asn Met Asp Glu Glu Ile Ser
145 150 155 160

ACG CTG GCT CTC TCT AAT GAA TTG TGC CTT GCC GGG ATC GAA ACA AAA  528
Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly Ile Glu Thr Lys
165 170 175

ACA GGA AAA TCA CAA GAT GAA GTG ATG GAT ATG TTG TCA ACT TAT CGT  576
Thr Gly Lys Ser Gln Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg
180 185 190

TTA AGT GGA GAG ACA CCT TAT CAT CAC GCT TAT GAA ACT GTT CGT GAA  624
Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu
195 200 205

ATC GTT CAT GAA CGT GAT CCA GGA TTT CGT CAT TTG TCA CAG GCA CCC  672
Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln Ala Pro
210 215 220

ATT GTT GCT GCT AAG CTC GAT CCT GTG ACT TTG TTG GGT ATT AGC TCC  720
Ile Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly Ile Ser Ser
225 230 235 240

CAT ATT TCG CCA GAA CTG TAT AAC TTG CTG ATT GAG GAG ATC CCG GAA  768
His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu
245 250 255

AAA GAT GAA GCC GCG CTT GAT ACG CTT TAT AAA ACA AAC TTT GGC GAT  816
Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp
260 265 270

ATT ACT ACT GCT CAG TTA ATG TCC CCA AGT TAT CTG GCC CGG TAT TAT  864
Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr
275 280 285

GGC GTC TCA CCG GAA GAT ATT GCC TAC GTG ACG ACT TCA TTA TCA CAT  912
Gly Val Ser Pro Glu Asp Ile Ala Tyr Val Thr Thr Ser Leu Ser His
290 295 300

GTT GGA TAT AGC AGT GAT ATT CTG GTT ATT CCG TTG GTC GAT GGT GTG  960
Val Gly Tyr Ser Ser Asp Ile Leu Val Ile Pro Leu Val Asp Gly Val
305 310 315 320

GGT AAG ATG GAA GTA GTT CGT GTT ACC CGA ACA CCA TCG GAT AAT TAT  1008
Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr
325 330 335

ACC AGT CAG ACG AAT TAT ATT GAG CTG TAT CCA CAG GGT GGC GAC AAT  1056
Thr Ser Gln Thr Asn Tyr Ile Glu Leu Tyr Pro Gln Gly Gly Asp Asn
340 345 350

TAT TTG ATC AAA TAC AAT CTA AGC AAT AGT TTT GGT TTG GAT GAT TTT  1104
Tyr Leu Ile Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe
355 360 365

TAT CTG CAA TAT AAA GAT GGT TCC GCT GAT TGG ACT GAG ATT GCC CAT  1152
Tyr Leu Gln Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu Ile Ala His
370 375 380

AAT CCC TAT CCT GAT ATG GTC ATA AAT CAA AAG TAT GAA TCA CAG GCG  1200
Asn Pro Tyr Pro Asp Met Val Ile Asn Gln Lys Tyr Glu Ser Gln Ala
385 390 395 400

ACA ATC AAA CGT AGT GAC TCT GAC AAT ATA CTC AGT ATA GGG TTA CAA  1248
Thr Ile Lys Arg Ser Asp Ser Asp Asn Ile Leu Ser Ile Gly Leu Gln
405 410 415

AGA TGG CAT AGC GGT AGT TAT AAT TTT GCC GCC GCC AAT TTT AAA ATT  1296
Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys Ile
420 425 430

GAC CAA TAC TCC CCG AAA GCT TTC CTG CTT AAA ATG AAT AAG GCT ATT  1344
Asp Gln Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala Ile
435 440 445

CGG TTG CTC AAA GCT ACC GGC CTC TCT TTT GCT ACG TTG GAG CGT ATT  1392
Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg Ile
450 455 460

GTT GAT AGT GTT AAT AGC ACC AAA TCC ATC ACG GTT GAG GTA TTA AAC  1440
Val Asp Ser Val Asn Ser Thr Lys Ser Ile Thr Val Glu Val Leu Asn
465 470 475 480

AAG GTT TAT CGG GTA AAA TTC TAT ATT GAT CGT TAT GGC ATC AGT GAA  1488
Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile Ser Glu
485 490 495

GAG ACA GCC GCT ATT TTG GCT AAT ATT AAT ATC TCT CAG CAA GCT GTT  1536
Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gln Gln Ala Val
500 505 510

GGC AAT CAG CTT AGC CAG TTT GAG CAA CTA TTT AAT CAC CCG CCG CTC  1584
Gly Asn Gln Leu Ser Gln Phe Glu Gln Leu Phe Asn His Pro Pro Leu
515 520 525

AAT GGT ATT CGC TAT GAA ATC AGT GAG GAC AAC TCC AAA CAT CTT CCT  1632
Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro
530 535 540

AAT CCT GAT CTG AAC CTT AAA CCA GAC AGT ACC GGT GAT GAT CAA CGC  1680
Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gln Arg
545 550 555 560

AAG GCG GTT TTA AAA CGC GCG TTT CAG GTT AAC GCC AGT GAG TTG TAT  1728
Lys Ala Val Leu Lys Arg Ala Phe Gln Val Asn Ala Ser Glu Leu Tyr
565 570 575

CAG ATG TTA TTG ATC ACT GAT CGT AAA GAA GAC GGT GTT ATC AAA AAT  1776
Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn
580 585 590

AAC TTA GAG AAT TTG TCT GAT CTG TAT TTG GTT AGT TTG CTG GCC CAG  1824
Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gln
595 600 605

ATT CAT AAC CTG ACT ATT GCT GAA TTG AAC ATT TTG TTG GTG ATT TGT 13 
He His Asn Leu Thr He Ala Glu Leu Asn He Leu Leu Val He Cys 
610 615 620 

GGC TAT GGC GAC ACC AAC ATT TAT CAG ATT ACC GAC GAT AAT TTA GCC 192 0 
Gly Tyr Gly Asp Thr Asn He Tyr Gin He Thr Asp Asp Asn Leu Ala 
o2S 630 635 $40 

AAA ATA GTG GAA AC A TTG TTG TGG ATC ACT CAA TGG TTG AAG ACC CAA 1963 
Lys He Val Glu Thr Leu Leu Trp He Thr Gin Trp Leu Lys Thr Gin 
645 650 655 

15 AAA TGG AC A GTT ACC GAC CTG TTT CTG ATG ACC ACG GCC ACT TAC AGC 2016 
Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser 

660 665 670 

ACC ACT TTA ACG CCA GAA ATT AGC AAT CTG ACG GCT ACG TTG TCT TC} 2 064 
20 Thr Thr Leu Thr Pro Glu He Ser Asn Leu Thr Ala Thr Leu Ser Ser 
o~S 680 685 

ACT TTG CAT GGC AAA GAG AGT CTG ATT GGG GAA GAT CTG AAA AG A GCA 2112 
Thr Leu His Gly Lys Glu Ser Leu lie Gly Glu Asp Leu Lys Arg Ala 
25 690 695 700 



ATG GCG CCT TGC TTC ACT TCG GCT TTG CAT TTG ACT TCT CAA GAA GTT 2150 

Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin Glu Val 
705 710 715 720 

GCG TAT GAC CTG CTG TTG TGG ATA GAC CAG ATT CAA CCG GCA CAA ATA 2208 

Ala Tyr Asp Leu Leu Leu Trp He Asp Gin He Gin Pro Ala Gin He 
"25 730 735 

35 ACT GTT GAT GGG TTT TGG GAA GAA GTG CAA ACA ACA CCA ACC AGC TTG 22 56 

Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 

740 745 750 

AAG GTG ATT ACC TTT GCT CAG GTG CTG GCA CAA TTG AGC CTG ATC TAT 23 04 

40 Lys Val He Thr Phe Ala Gin Val Leu Ala Gin Leu Ser Leu He Tyr 

755 760 765 

CGT CGT ATT GGG TTA AGT GAA ACG GAA CTG TCA CTG ATC GTG ACT CAA 23 52 

Arg Arg He Gly Leu Ser Glu Thr Glu Leu Ser Leu He Val Thr Gin 

45 770 775 730 



TCT TCT CTG CTA GTG GCA GGC AAA AGC ATA CTG GAT CAC GGT CTG TTA 24 00 

Ser Ser Leu Leu Val Ala Gly Lys Ser He Leu Asp His Gly Leu Leu 

? 35 790 795 800 

ACC CTG ATG GCC TTG GAA GGT TTT CAT ACC TGG GTT AAT GGC TTG GGG 24 4 3 

Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly 

805 810 815 

CAA CAT GCC TCC TTG ATA TTG GCG GCG TTG AAA GAC GGA GCC TTG ACA 24 9 6 

Gin His Ala Ser Leu He Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 

820 825 830 



GTT ACC GAT GTA GCA CAA GCT ATG AAT AAG GAG GAA TCT CTC CTA CAA 25 4 4 
60 Val Thr Asp Val Ala Gin Ala Met Asn Lys Glu Glu Ser Leu Leu Gin 
835 840 845 

ATG GCA GCT AAT CAG GTG GAG AAG GAT CTA ACA AAA CTG ACC AGT TGG 25 92 
Met Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 
"5 350 855 360 
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20 



ACA CAG ATT GAC GCT ATT CTG CAA TGG TTA CAG ATG TCT TCG GCC TTG 2640 
Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu 
865 870 875 880 

GCG GTT TCT CCA CTG GAT CTG GCA GGG ATG ATG GCC CTG AAA TAT GGG 2688 
Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly 
885 890 895 

ATA GAT CAT AAC TAT GCT GCC TGG CAA GCT GCG GCG GCT GCG CTG ATG 2736 
Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met 
900 905 910 

GCT GAT CAT GCT AAT CAG GCA CAG AAA AAA CTG GAT GAG ACG TTC ACT 2784 
Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser 
915 920 925 

AAG GCA TTA TGT AAC TAT TAT ATT AAT GCT GTT GTC GAT AGT GCT GCT 2832 
Lys Ala Leu Cys Asn Tyr Tyr Ile Asn Ala Val Val Asp Ser Ala Ala 
930 935 940 

GGA GTA CGT GAT CGT AAC GGT TTA TAT ACC TAT TTG CTG ATT GAT AAT 2880 
Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn 
945 950 955 960 

CAG GTT TCT GCC GAT GTG ATC ACT TCA CGT ATT GCA GAA GCT ATC GCC 2928 
Gln Val Ser Ala Asp Val Ile Thr Ser Arg Ile Ala Glu Ala Ile Ala 
965 970 975 

GGT ATT CAA CTG TAC GTT AAC CGG GCT TTA AAC CGA GAT GAA GGT CAG 2976 
Gly Ile Gln Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gln 
980 985 990 

CTT GCA TCG GAC GTT AGT ACC CGT CAG TTC TTC ACT GAC TGG GAA CGT 3024 
Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg 
995 1000 1005 

TAC AAT AAA CGT TAC AGT ACT TGG GCT GGT CTC TCT GAA CTG GTC TAT 3072 

Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 

1010 1015 1020 


TAT CCA GAA AAC TAT GTT GAT CCC ACT CAG CGC ATT GGG CAA ACC AAA 3120 

Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg He Gly Gin Thr Lys 

1025 1030 1035 1040 

ATG ATG GAT GCG CTG TTG CAA TCC ATC AAC CAG AGC CAG CTA AAT GCG 3168 
Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala 
1045 1050 1055 

GAT ACG GTG GAA GAT GCT TTC AAA ACT TAT TTG ACC AGC TTT GAG CAG 3216 
Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gln 
1060 1065 1070 

GTA GCA AAT CTG AAA GTA ATT AGT GCT TAC CAC GAT AAT GTG AAT GTG 3264 
Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val 
1075 1080 1085 

GAT CAA GGA TTA ACT TAT TTT ATC GGT ATC GAC CAA GCA GCT CCG GGT 3312 
Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala Pro Gly 
1090 1095 1100 






ACG TAT TAC TGG CGT AGT GTT GAT CAC AGC AAA TGT GAA AAT GGC AAG 3360 
Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 
1105 1110 1115 1120 



TTT GCC GCT AAT GCT TGG GGT GAG TGG AAT AAA ATT ACC TGT GCT GTC 3408 
Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val 
1125 1130 1135 

AAT CCT TGG AAA AAT ATC ATC CGT CCG GTT GTT TAT ATG TCC CGC TTA 3456 
Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser Arg Leu 
1140 1145 1150 

TAT CTG CTA TGG CTG GAG CAG CAA TCA AAG AAA AGT GAT GAT GGT AAA 3504 
Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys 
1155 1160 1165 

ACC ACG ATT TAT CAA TAT AAC TTA AAA CTG GCT CAT ATT CGT TAC GAC 3552 
Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp 
1170 1175 1180 



GGT AGT TGG AAT ACA CCA TTT ACT TTT GAT GTG ACA GAA AAG GTA AAA 3600 
Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys 
1185 1190 1195 1200 

AAT TAC ACG TCG AGT ACT GAT GCT GCT GAA TCT TTA GGG TTG TAT TGT 3648 
Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys 

1205 1210 1215 

ACT GGT TAT CAA GGG GAA GAC ACT CTA TTA GTT ATG TTC TAT TCG ATG 3696 
Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 
1220 1225 1230 

CAG AGT AGT TAT AGC TCC TAT ACC GAT AAT AAT GCG CCG GTC ACT GGG 3744 
Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly 
1235 1240 1245 



CTA TAT ATT TTC GCT GAT ATG TCA TCA GAC AAT ATG ACG AAT GCA CAA 3792 
Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gln 
1250 1255 1260 



GCA ACT AAC TAT TGG AAT AAC AGT TAT CCG CAA TTT GAT ACT GTG ATG 3840 
Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr Val Met 
1265 1270 1275 1280 

GCA GAT CCG GAT AGC GAC AAT AAA AAA GTC ATA ACC AGA AGA GTT AAT 3888 
Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn 

1285 1290 1295 

AAC CGT TAT GCG GAG GAT TAT GAA ATT CCT TCC TCT GTG ACA AGT AAC 3936 
Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn 
1300 1305 1310 

AGT AAT TAT TCT TGG GGT GAT CAC AGT TTA ACC ATG CTT TAT GGT GGT 3984 
Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 
1315 1320 1325 



AGT GTT CCT AAT ATT ACT TTT GAA TCG GCG GCA GAA GAT TTA AGG CTA 4032 
Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 
1330 1335 1340 



TCT ACC AAT ATG GCA TTG AGT ATT ATT CAT AAT GGA TAT GCG GGA ACC 4080 
Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr 
1345 1350 1355 1360 

CGC CGT ATA CAA TGT AAT CTT ATG AAA CAA TAC GCT TCA TTA GGT GAT 4128 
Arg Arg Ile Gln Cys Asn Leu Met Lys Gln Tyr Ala Ser Leu Gly Asp 

1365 1370 1375 

AAA TTT ATA ATT TAT GAT TCA TCA TTT GAT GAT GCA AAC CGT TTT AAT 4176 
Lys Phe Ile Ile Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 
1380 1385 1390 

CTG GTG CCA TTG TTT AAA TTC GGA AAA GAC GAG AAC TCA GAT GAT ACT 4224 
Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser 
1395 1400 1405 

ATT TGT ATA TAT AAT GAA AAC CCT TCC TCT GAA GAT AAG AAG TGG TAT 4272 
Ile Cys Ile Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 

1410 1415 1420 

TTT TCT TCG AAA GAT GAC AAT AAA ACA GCG GAT TAT AAT GGT GGA ACT 4320 
Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr 
1425 1430 1435 1440 

CAA TGT ATA GAT GCT GGA ACC ACT AAC AAA GAT TTT TAT TAT AAT CTC 4368 
Gln Cys Ile Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 
1445 1450 1455 

CAG GAG ATT GAA GTA ATT AGT GTT ACT GGT GGG TAT TGG TCG AGT TAT 4416 
Gln Glu Ile Glu Val Ile Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 
1460 1465 1470 


AAA ATA TCC AAC CCG ATT AAT ATC AAT ACG GGC ATT GAT AGT GCT AAA 4464 
Lys Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys 
1475 1480 1485 

GTA AAA GTC ACC GTA AAA GCG GGT GGT GAC GAT CAA ATC TTT ACT GCT 4512 
Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gln Ile Phe Thr Ala 

1490 1495 1500 

GAT AAT AGT ACC TAT GTT CCT CAG CAA CCG GCA CCC AGT TTT GAG GAG 4560 
Asp Asn Ser Thr Tyr Val Pro Gln Gln Pro Ala Pro Ser Phe Glu Glu 
1505 1510 1515 1520 

ATG ATT TAT CAG TTC AAT AAC CTG ACA ATA GAT TGT AAG AAT TTA AAT 4608 
Met Ile Tyr Gln Phe Asn Asn Leu Thr Ile Asp Cys Lys Asn Leu Asn 
1525 1530 1535 

TTC ATC GAC AAT CAG GCA CAT ATT GAG ATT GAT TTC ACC GCT ACG GCA 4656 
Phe Ile Asp Asn Gln Ala His Ile Glu Ile Asp Phe Thr Ala Thr Ala 
1540 1545 1550 






CAA GAT GGC CGA TTC TTG GGT GCA GAA ACT TTT ATT ATC CCG GTA ACT 4704 
Gln Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe Ile Ile Pro Val Thr 
1555 1560 1565 



AAA AAA GTT CTC GGT ACT GAG AAC GTG ATT GCG TTA TAT AGC GAA AAT 4752 
Lys Lys Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn 
1570 1575 1580 

AAC GGT GTT CAA TAT ATG CAA ATT GGC GCA TAT CGT ACC CGT TTG AAT 4800 
Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn 
1585 1590 1595 1600 

ACG TTA TTC GCT CAA CAG TTG GTT AGC CGT GCT AAT CGT GGC ATT GAT 4848 
Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp 
1605 1610 1615 

GCA GTG CTC AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAA TTA GGA 4896 
Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly 
1620 1625 1630 



GCG GGC ACA TAT GTG CAG CTT GTG TTG GAT AAA TAT GAT GAG TCT ATT 4944 
Ala Gly Thr Tyr Val Gln Leu Val Leu Asp Lys Tyr Asp Glu Ser Ile 
1635 1640 1645 



CAT GGC ACT AAT AAA AGC TTT GCT ATT GAA TAT GTT GAT ATA TTT AAA 4992 
His Gly Thr Asn Lys Ser Phe Ala Ile Glu Tyr Val Asp Ile Phe Lys 
1650 1655 1660 

GAG AAC GAT AGT TTT GTG ATT TAT CAA GGA GAA CTT AGC GAA ACA AGT 5040 
Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser 
1665 1670 1675 1680 

CAA ACT GTT GTG AAA CTT TTC TTA TCC TAT TTT ATA GAG GCG ACT GGA 5088 
Gln Thr Val Val Lys Val Phe Leu Ser Tyr Phe Ile Glu Ala Thr Gly 
1685 1690 1695 

AAT AAG AAC CAC TTA TGG GTA CGT GCT AAA TAC CAA AAG GAA ACG ACT 5136 
Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gln Lys Glu Thr Thr 
1700 1705 1710 

GAT AAG ATC TTG TTC GAC CGT ACT GAT GAG AAA GAT CCG CAC GGT TGG 5184 
Asp Lys Ile Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 
1715 1720 1725 

TTT CTC AGC GAC GAT CAC AAG ACC TTT AGT GGT CTC TCT TCC GCA CAG 5232 
Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gln 
1730 1735 1740 

GCA TTA AAG AAC GAC AGT GAA CCG ATG GAT TTC TCT GGC GCC AAT GCT 5280 
Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala 
1745 1750 1755 1760 

CTC TAT TTC TGG GAA CTG TTC TAT TAC ACG CCG ATG ATG ATG GCT CAT 5328 
Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His 
1765 1770 1775 


CGT TTG TTG CAG GAA CAG AAT TTT GAT GCG GCG AAC CAT TGG TTC CGT 5376 
Arg Leu Leu Gln Glu Gln Asn Phe Asp Ala Ala Asn His Trp Phe Arg 
1780 1785 1790 

TAT GTC TGG AGT CCA TCC GGT TAT ATC GTT GAT GGT AAA ATT GCT ATC 5424 
Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile 
1795 1800 1805 

TAC CAC TGG AAC GTG CGA CCG CTG GAA GAA GAC ACC AGT TGG AAT GCA 5472 
Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 
1810 1815 1820 

CAA CAA CTG GAC TCC ACC GAT CCA GAT GCT GTA GCC CAA GAT GAT CCG 5520 
Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp Asp Pro 
1825 1830 1835 1840 

ATG CAC TAC AAG GTG GCT ACC TTT ATG GCG ACG TTG GAT CTG CTA ATG 5568 
Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met 
1845 1850 1855 






GCC CGT GGT GAT GCT GCT TAC CGC CAG TTA GAG CGT GAT ACG TTG GCT 5616 
Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala 
1860 1865 1870 



GAA GCT AAA ATG TGG TAT ACA CAG GCG CTT AAT CTG TTG GGT GAT GAG 5664 
Glu Ala Lys Met Trp Tyr Thr Gln Ala Leu Asn Leu Leu Gly Asp Glu 
1875 1880 1885 

CCA CAA GTG ATG CTG AGT ACG ACT TGG GCT AAT CCA ACA TTG GGT AAT 5712 
Pro Gln Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn 
1890 1895 1900 

GCT GCT TCA AAA ACC ACA CAG CAG GTT CGT CAG CAA GTG CTT ACC CAG 5760 
Ala Ala Ser Lys Thr Thr Gln Gln Val Arg Gln Gln Val Leu Thr Gln 
1905 1910 1915 1920 

7T3 CGT :~r AAT AGC AGG GTA AAA ACC CCG TTG CTA GGA ACA GCC AAT 5808 

Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 
1925 1930 1935 

TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA AAT AGC AAG CTC AAA GGC 5856 
Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn Ser Lys Leu Lys Gly 

1940 1945 1950 

TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT TTA CGT CAT AAT CTG 5904 
Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His Asn Leu 
1955 1960 1965 

TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG TAT GCT AAA CCG GCT 5952 
Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 
1970 1975 1980 

GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA GCT TCT CAA GGG GGA 6 000 

Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln Gly Gly 
1985 1990 1995 2000 






GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC CGC TTC CCT CAA ATG 6048 
Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gln Met 
2005 2010 2015 



CTA GAA GGG GCA CGG GGC TTG GTT AAC CAC CTT ATA CAG TTC GGT AGT 6096 
Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu Ile Gln Phe Gly Ser 
2020 2025 2030 

TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG GAA GCT ATG AGT CAA 6144 
Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala Glu Ala Met Ser Gln 
2035 2040 2045 

CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG ACC AGT ATT CGT ATG 6192 
Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu Thr Ser Ile Arg Met 
2050 2055 2060 

CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA AAA ACC GCC TTG CAA 6240 
Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gln 
2065 2070 2075 2080 



GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC AGC TAT AGC CAA CTG 6288 
Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 
2085 2090 2095 



TAT GAG GAG AAC ATC AAC GCA GGT GAG CAC CGA GCG CTC GCG TTA CGC 6336 
Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg Ala Leu Ala Leu Arg 
2100 2105 2110 

TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG ATT TCC CGT ATG GCA 6384 
Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln Ile Ser Arg Met Ala 
2115 2120 2125 

GGC GCG CGT GTT GAT ATG GCA CCA AAT ATC TTC GGC CTG GCT GAT GGC 6432 
Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe Gly Leu Ala Asp Gly 
2130 2135 2140 

GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC GCT GAC GGT ATT GAG 6480 
Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile Ala Asp Gly Ile Glu 
2145 2150 2155 2160 


TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG AAA GTT GCT CAG TCG 6528 
Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gln Ser 
2165 2170 2175 

GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA ATT CAG CGT GAC AAC 6576 
Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys Ile Gln Arg Asp Asn 
2180 2185 2190 

GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA CTG GAA TCA CTG TCT 6624 
Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln Leu Glu Ser Leu Ser 
2195 2200 2205 

ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG TAC CTG AAA ACC CAG 6672 
Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu Tyr Leu Lys Thr Gln 

2210 2215 2220 


CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA AGA AGC AAA TTC AGT 6720 
Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu Arg Ser Lys Phe Ser 

2225 2230 2235 2240 

AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT TTG TCA GGT ATT TAT 6768 
Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly Ile Tyr 

2245 2250 2255 

TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC CTG ATG GCA GAG CAA 6816 
Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gln 

2260 2265 2270 

TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT AGC TTT GTC AAA CCG 6864 
Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile Ser Phe Val Lys Pro 
2275 2280 2285 

GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG TGT GGA GAA GCT TTG 6912 
Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 

2290 2295 2300 


ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT CTG AAA TGG GAA TCT 6960 
Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 

2305 2310 2315 2320 

CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG GCA GTG GTT TAT GAT 7008 

Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 

2325 2330 2335 

TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG GAA CAA ATA CCT GCA 7056 
Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gln Ile Pro Ala 

2340 2345 2350 

TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT AAA GAA AAT GGG TTA 7104 
Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 
2355 2360 2365 

TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC AAA TTG TCC GAC TTG 7152 

Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 
2370 2375 2380 






AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT GGT AGC AAC AAG GTT 7200 
Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val 
2385 2390 2395 2400 



CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT GCA TTG GTT GGG CCT 7248 
Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro 
2405 2410 2415 

TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT GGC AGT ACT CAA TTG 7296 
Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu 
2420 2425 2430 

CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT GGT ACC AAT GAT AGT 7344 
Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 
2435 2440 2445 

GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA TAC CTG CCA TTT GAA 7392 
Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu 
2450 2455 2460 

GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT CTT CAA TTT CCG AAT 7440 
Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn 
2465 2470 2475 2480 

GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT ATG AGC GAT ATT ATT 7488 
Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile 

2485 2490 2495 

TTG CAT ATT CGT TAT ACC ATC CGT TAA 7515 
Leu His Ile Arg Tyr Thr Ile Arg * 
2500 2505 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2505 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Gln Asn Ser Leu Ser Ser Thr Ile Asp Thr Ile Cys Gln Lys Leu 
1 5 10 15 






Gin Leu Thr Cys Pro Ala Glu He Ala Leu Tyr Pro Phe Asp Thr Phe 
20 25 30 

Arg Glu Lys Thr Arg Gly Met Val Asn Trp Gly Glu Ala Lys Arg He 
35 40 45 



Tyr Glu He Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His Glu Lys 
40 50 55 60 

Arg He phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu 
65 70 75 80 

45 Gly Thr Arg Gin Met Leu Gly Phe He Gin Gly Tyr Ser Asp Leu Phe 

85 90 95 



50 



Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala Ser Met 
100 105 110 

Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn 
115 120 125 



Leu His Asp Ser Ser Ser He Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 

55 130 135 140 

Leu Ala Ser Leu Met Leu Ser Gin Lys Asn Met Asp Glu Glu He Ser 

145 150 155 160 

60 Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr Lys 

165 170 175 



65 



Thr Gly Lys Ser Gin Asp Glu Val Met Asp Met Leu Ser Thr Tyr Arg 
180 185 190 

Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu 
195 200 205 

Ile Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gln Ala Pro 
210 215 220 

He Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly He Ser Ser 
225 230 235 240 

His Ile Ser Pro Glu Leu Tyr Asn Leu Leu Ile Glu Glu Ile Pro Glu 

245 250 255 






Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 

260 265 270 

Ile Thr Thr Ala Gln Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr 

275 280 285 



Gly Val Ser Pro Glu Asp He Ala Tyr Val Thr Thr Ser Leu Ser His 
20 290 295 300 

Val Gly Tyr Ser Ser Asp He Leu Val lie Pro Leu Val Asp Gly Val 
305 310 315 320 

25 Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr 

325 330 335 



Thr Ser Gin Thr Asn Tyr He Glu Leu Tyr Pro Gin Gly Gly Asp Asn 
340 345 350 

Tyr Leu He Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe 
355 360 365 



Tyr Leu Gin Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu He Ala His 
35 370 375 380 

Asn Pro Tyr Pro Asp Met Val He Asn Gin Lys Tyr Glu Ser Gin Ala 
385 390 395 400 

40 Thr He Lys Arg Ser Asp Ser Asp Asn He Leu Ser He Gly Leu Gin 

405 410 415 



Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys He 

420 425 430 

Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala He 

435 440 445 



Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg He 
50 450 455 460 

Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val Leu Asn 
465 470 475 480 

Lys Val Tyr Arg Val Lys Phe Tyr Ile Asp Arg Tyr Gly Ile Ser Glu 

485 490 495 






Glu Thr Ala Ala Ile Leu Ala Asn Ile Asn Ile Ser Gln Gln Ala Val 
500 505 510 

Gly Asn Gin Leu Ser Gin Phe Glu Gin Leu Phe Asn His Pro Pro Leu 
515 520 525 



Asn Gly Ile Arg Tyr Glu Ile Ser Glu Asp Asn Ser Lys His Leu Pro 
530 535 540 

Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gln Arg 
545 550 555 560 

Lys Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Glu Leu Tyr 
565 570 575 

Gln Met Leu Leu Ile Thr Asp Arg Lys Glu Asp Gly Val Ile Lys Asn 
580 585 590 

Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gln 
595 600 605 

He His Asn Leu Thr He Ala Glu Leu Asn He Leu Leu Val He Cys 
610 615 620 

Gly Tyr Gly Asp Thr Asn He Tyr Gin He Thr Asp Asp Asn Leu Ala 
625 630 635 640 

Lys Ile Val Glu Thr Leu Leu Trp Ile Thr Gln Trp Leu Lys Thr Gln 
645 650 655 

Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser 
660 665 670 

Thr Thr Leu Thr Pro Glu He Ser Asn Leu Thr Ala Thr Leu Ser Ser 
675 680 685 

Thr Leu His Gly Lys Glu Ser Leu He Gly Glu Asp Leu Lys Arg Ala 
690 695 700 

Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin Glu Val 
705 710 715 720 

Ala Tyr Asp Leu Leu Leu Trp He Asp Gin He Gin Pro Ala Gin He 
725 730 735 

Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 
740 745 750 

Lys Val He Thr Phe Ala Gin Val Leu Ala Gin Leu Ser Leu He Tyr 
755 760 765 

Arg Arg He Gly Leu Ser Glu Thr Glu Leu Ser Leu He Val Thr Gin 
770 775 780 

Ser Ser Leu Leu Val Ala Gly Lys Ser He Leu Asp His Gly Leu Leu 
785 790 795 800 

Thr Leu Met Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly 
805 810 815 

Gin His Ala Ser Leu He Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 
820 825 830 

Val Thr Asp Val Ala Gln Ala Met Asn Lys Glu Glu Ser Leu Leu Gln 
835 840 845 

Met Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 
850 855 860 

Thr Gln Ile Asp Ala Ile Leu Gln Trp Leu Gln Met Ser Ser Ala Leu 
865 870 875 880 

Ala Val Ser Pro Leu Asp Leu Ala Gly Met Met Ala Leu Lys Tyr Gly 
885 890 895 

Ile Asp His Asn Tyr Ala Ala Trp Gln Ala Ala Ala Ala Ala Leu Met 
900 905 910 

Ala Asp His Ala Asn Gln Ala Gln Lys Lys Leu Asp Glu Thr Phe Ser 
915 920 925 

Lys Ala Leu Cys Asn Tyr Tyr lie Asn Ala Val Val Asp Ser Ala Ala 
930 935 940 

Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu Ile Asp Asn 
945 950 955 960 

Gin Val Ser Ala Asp Val He Thr Ser Arg lie Ala Glu Ala He Ala 
965 970 975 

Gly He Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 
980 985 990 

Leu Ala Ser Asp Val Ser Thr Arg Gln Phe Phe Thr Asp Trp Glu Arg 
995 1000 1005 

Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 
1010 1015 1020 

Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg He Gly Gin Thr Lys 
1025 1030 1035 1040 

Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala 
1045 1050 1055 

Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gin 
1060 1065 1070 

Val Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn Val Asn Val 
1075 1080 1085 

Asp Gin Gly Leu Thr Tyr Phe He Gly He Asp Gin Ala Ala Pro Gly 
1090 1095 1100 

Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 
1105 1110 1115 1120 

Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val 
1125 1130 1135 

Asn Pro Trp Lys Asn He He Arg Pro Val Val Tyr Met Ser Arg Leu 
1140 1145 1150 

Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys 
1155 1160 1165 

Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp 
1170 1175 1180 

Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys 
1185 1190 1195 1200 

Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys 
1205 1210 1215 

Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met 
1220 1225 1230 

Gin Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly 
1235 1240 1245 






Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gln 
1250 1255 1260 

Ala Thr Asn Tyr Trp Asn Asn Ser Tyr Pro Gln Phe Asp Thr Val Met 
1265 1270 1275 1280 

Ala Asp Pro Asp Ser Asp Asn Lys Lys Val Ile Thr Arg Arg Val Asn 
1285 1290 1295 

Asn Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Thr Ser Asn 
1300 1305 1310 






Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 
1315 1320 1325 

Ser Val Pro Asn Ile Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 
1330 1335 1340 



Ser Thr Asn Met Ala Leu Ser Ile Ile His Asn Gly Tyr Ala Gly Thr 
1345 1350 1355 1360 

Arg Arg He Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu Gly Asp 
1365 1370 1375 

25 Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 
1380 1385 1390 






Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Glu Asn Ser Asp Asp Ser 
1395 1400 1405 

He Cys He Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 
1410 1415 1420 



Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Gly Gly Thr 
35 1425 1430 1435 1440 

Gin cys He Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 
1445 1450 1455 

40 Gin Glu He Glu Val He Ser Val Thr Gly Gly Tyr Trp Ser Ser Tyr 
1460 1465 1470 






Lys He Ser Asn Pro He Asn He Asn Thr Gly He Asp Ser Ala Lys 
1475 1480 1485 

Val Lys Val Thr Val Lys Ala Gly Gly Asp Asp Gin He Phe Thr Ala 

1490 1495 1500 



Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Glu Glu 
50 1505 1510 1515 1520 

Met He Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn Leu Asn 
1525 1530 1535 

55 Phe He Asp Asn Gin Ala His He Glu He Asp Phe Thr Ala Thr Ala 
1540 1545 1550 






Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe He He Pro Val Thr 
1555 1560 1565 

Lys Lys Val Leu Gly Thr Glu Asn Val He Ala Leu Tyr Ser Glu Asn 
1570 1575 1580 



Asn Gly Val Gln Tyr Met Gln Ile Gly Ala Tyr Arg Thr Arg Leu Asn 
1585 1590 1595 1600 






Thr Leu Phe Ala Gln Gln Leu Val Ser Arg Ala Asn Arg Gly Ile Asp 
1605 1610 1615 

Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly 
1620 1625 1630 

Ala Gly Thr Tyr Val Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser lie 
1635 1640 1645 

His Gly Thr Asn Lys Ser Phe Ala lie Glu Tyr Val Asp lie Phe Lys 
1650 1655 1660 

Glu Asn Asp Ser Phe Val Ile Tyr Gln Gly Glu Leu Ser Glu Thr Ser 
1665 1670 1675 1680 

Gin Thr Val Val Lys Val Phe Leu Ser Tyr Phe lie Glu Ala Thr Gly 
1685 1690 1695 

Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gin Lys Glu Thr Thr 
1700 1705 1710 

Asp Lys lie Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 
1715 1720 1725 

Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 
1730 1735 1740 

Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ala 
1745 1750 1755 1760 

Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Met Met Met Ala His 
1765 1770 1775 

Arg Leu Leu Gin Glu Gin Asn Phe Asp Ala Ala Asn His Trp Phe Arg 
1780 1785 1790 

Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val Asp Gly Lys Ile Ala Ile 
1795 1800 1805 

Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 
1810 1815 1820 

Gln Gln Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gln Asp Asp Pro 
1825 1830 1835 1840 

Met His Tyr Lys Val Ala Thr Phe Met Ala Thr Leu Asp Leu Leu Met 
1845 1850 1855 

Ala Arg Gly Asp Ala Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Ala 
1860 1865 1870 

Glu Ala Lys Met Trp Tyr Thr Gin Ala Leu Asn Leu Leu Gly Asp Glu 
1875 1880 1885 

Pro Gin Val Met Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn 
1890 1895 1900 

Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu Thr Gin 
1905 1910 1915 1920 

Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 
1925 1930 1935 

Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu Lys Gly 
1940 1945 1950 

Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn Leu Arg His Asn Leu 
1955 1960 1965 

Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 
1970 1975 1980 

Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gln Gly Gly 
1985 1990 1995 2000 

Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His Arg Phe Pro Gln Met 

2005 2010 2015 






Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu lie Gin Phe Gly Ser 
2020 2025 2030 

Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Met Ser Gin 
2035 2040 2045 



Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu Thr Ser He Arg Met 
20 2050 2055 2060 

Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gin 
2065 2070 2075 2080 

25 Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 

2085 2090 2095 



Tyr Glu Glu Asn He Asn Ala Gly Glu Gin Arg Ala Leu Ala Leu Arg 
2100 2105 2110 

Ser Glu Ser Ala He Glu Ser Gin Gly Ala Gin He Ser Arg Met Ala 
2115 2120 2125 



Gly Ala Gly Val Asp Met Ala Pro Asn He Phe Gly Leu Ala Asp Gly 
35 2130 2135 2140 

Gly Met His Tyr Gly Ala He Ala Tyr Ala He Ala Asp Gly He Glu 
2145 2150 2155 2160 

40 Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu Lys Val Ala Gin Ser 

2165 2170 2175 



Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys He Gin Arg Asp Asn 
2180 2185 2190 

Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu Ser 

2195 2200 2205 



He Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys Thr Gin 
50 2210 2215 2220 

Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 
2225 2230 2235 2240 

55 Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly He Tyr 

2245 2250 2255 



Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met Ala Glu Gin 
2260 2265 2270 

Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He Ser Phe Val Lys Pro 

2275 2280 2285 



Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 
2290 2295 2300 

Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr Leu Lys Trp Glu Ser 
2305 2310 2315 2320 

Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 
2325 2330 2335 

Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin lie Pro Ala 
2340 2345 2350 

10 Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 
2355 2360 2365 






Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 
2370 2375 2380 

Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val 
2385 2390 2395 2400 

Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro 
2405 2410 2415 

Tyr Gin Asp Val Gin Ala Met Leu Ser Tyr Gly Gly Ser Thr Gin Leu 
2420 2425 2430 

25 Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser 
2435 2440 2445 






Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu 
2450 2455 2460 

Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn 
2465 2470 2475 2480 



Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile 
2485 2490 2495 



Leu His Ile Arg Tyr Thr Ile Arg * 

2500 2505 



(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Xaa Ala 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Gln Asn Ser Gln Thr Phe Ser Val Gly Glu Leu 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Gln Asn Ser Leu 
1 5 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Ala Phe Asn Ile Asp Asp Val Ser Leu Phe 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile Gly Ser
1 5 10 15

Leu Gln Leu Phe Ile
20



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gly Ile Asp Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro
1 5 10 15

Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu
20 25



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly
1 5 10 15

Val Gln Tyr Met Gln Ile
20



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6005 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: RBS
(B) LOCATION: 1..9

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 16..3585
(D) OTHER INFORMATION: /product= "P8"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

AAGAAGGAAT TGATT ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA 51
Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu
1 5 10

GCG CGC CGT GAT GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC 99
Ala Arg Arg Asp Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro
15 20 25

GCA GAT TTA AAA GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT 147
Ala Asp Leu Lys Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr
30 35 40

CTG TTG CTG GAT ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG 195
Leu Leu Leu Asp Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu
45 50 55 60

TCC GAA GCG ATT GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG 243
Ser Glu Ala Ile Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu
65 70 75

GGC TAT GAC GGC ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT 291
Gly Tyr Asp Gly Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp
80 85 90

GAA CAG TTT TTA TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT 339
Glu Gln Phe Leu Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr
95 100 105

TGG GCT GGC AAG GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT 387
Trp Ala Gly Lys Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp
110 115 120



CCA ACA TTG CGA TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA 435
Pro Thr Leu Arg Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln
125 130 135 140

GGT ATT TCT CAA GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA 483
Gly Ile Ser Gln Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu
145 150 155

CGT GAT TAT CTA ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT 531
Arg Asp Tyr Leu Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile
160 165 170

ACT GCC TGC CAA GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT 579
Thr Ala Cys Gln Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg
175 180 185

ACA CAG AAT GCA CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC 627
Thr Gln Asn Ala Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val
190 195 200

ACT GAT GGC GGT AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA 675
Thr Asp Gly Gly Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala
205 210 215 220

ATT AAT GCC GGG ATT AGT GAG GCA TAT TCA GGG CAT GTC GAG CCT TTC 723
Ile Asn Ala Gly Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe
225 230 235



TGG GAA AAT AAC AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA 771
Trp Glu Asn Asn Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu
240 245 250

GAT AAA ATA GAT TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT 819
Asp Lys Ile Asp Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp
255 260 265

TAT AGC TGG GCA TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT GAC 867
Tyr Ser Trp Ala Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp
270 275 280

TAC AAT AGA GTT GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT 915
Tyr Asn Arg Val Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala
285 290 295 300

TCA CAA TAT GGT TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT 963
Ser Gln Tyr Gly Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr
305 310 315

GTA CTT ATT TTT CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG 1011
Val Leu Ile Phe Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val
320 325 330

ACG TTA TGT TAT GAC TCT GGC AAC GTG ATT AAG AAC CTA TCT AGT ACA 1059
Thr Leu Cys Tyr Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr
335 340 345

GGA AGT GCA AAT TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC 1107
Gly Ser Ala Asn Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg
350 355 360

ATG TGT CAT GGA CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA 1155
Met Cys His Gly Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr
365 370 375 380

CTC TCT ATT AAT ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA 1203
Leu Ser Ile Asn Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser
385 390 395

GAT GGA AAA CAA TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC 1251
Asp Gly Lys Gln Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His
400 405 410

CTC CCT AAT TAT GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT 1299
Leu Pro Asn Tyr Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp
415 420 425

TCA CTA CTT AAT TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG 1347
Ser Leu Leu Asn Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro
430 435 440

GTT GAT AAT TTC AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC 1395
Val Asp Asn Phe Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe
445 450 455 460

TTC CAT ATT CCG TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT 1443
Phe His Ile Pro Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg
465 470 475

TAC GAA GAC GCG GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT 1491
Tyr Glu Asp Ala Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly
480 485 490

TAT CGC GAT GCT AAT GGC CAG CTC ATT ATG GAT GGC AGT AAA CCA CGT 1539
Tyr Arg Asp Ala Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg
495 500 505

TAT TGG AAT GTG ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA 1587
Tyr Trp Asn Val Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr
510 515 520

CAG CCC GCC ACC ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG 1635
Gln Pro Ala Thr Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met
525 530 535 540

CAT TAC AAG CTG GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC 1683
His Tyr Lys Leu Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala
545 550 555

CGA GGC GAC AGC GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA 1731
Arg Gly Asp Ser Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu
560 565 570

GCC AAA ATG TAC TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT 1779
Ala Lys Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro
575 580 585

GAT ATC CAT ACC ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA 1827
Asp Ile His Thr Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu
590 595 600

GCT GGC GCT ATT GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG 1875
Ala Gly Ala Ile Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met
605 610 615 620

ACG TTC GCT GCC TGG CTA AGC GCA GGC GAT ACC GCA AAT ATT GGC GAC 1923
Thr Phe Ala Ala Trp Leu Ser Ala Gly Asp Thr Ala Asn Ile Gly Asp
625 630 635

GGT GAT TTC TTG CCA CCG TAC AAC GAT GTA CTA CTC GGT TAC TGG GAT 1971
Gly Asp Phe Leu Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp
640 645 650

AAA CTT GAG TTA CGC CTA TAC AAC CTG CGC CAC AAT CTG AGT CTG GAT 2019
Lys Leu Glu Leu Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp
655 660 665

GGT CAA CCG CTA AAT CTG CCA CTG TAT GCC ACG CCG GTA GAC CCO AAA 2067
Gly Gln Pro Leu Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys
670 675 680

ACC CTG CAA CGC CAG CAA GCC GGA GGG GAC GGT ACA GGC AGT AGT CCG 2115
Thr Leu Gln Arg Gln Gln Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro
685 690 695 700

GCT GGT GGT CAA GGC AGT GTT CAG GGC TGG CGC TAT CCG TTA TTG GTA 2163
Ala Gly Gly Gln Gly Ser Val Gln Gly Trp Arg Tyr Pro Leu Leu Val
705 710 715

GAA CGC GCC CGC TCT GCC GTG AGT TTG TTG ACT CAG TTC GGC AAC AGC 2211
Glu Arg Ala Arg Ser Ala Val Ser Leu Leu Thr Gln Phe Gly Asn Ser
720 725 730

TTA CAA ACA ACG TTA GAA CAT CAG GAT AAT GAA AAA ATG ACG ATA CTG 2259
Leu Gln Thr Thr Leu Glu His Gln Asp Asn Glu Lys Met Thr Ile Leu
735 740 745

TTG CAG ACT CAA CAG GAA GCC ATC CTG AAA CAT CAC CAC GAT ATA CAA 2307
Leu Gln Thr Gln Gln Glu Ala Ile Leu Lys His Gln His Asp Ile Gln
750 755 760

CAA AAT AAT CTA AAA GGA TTA CAA CAC AGC CTG ACC GCA TTA CAG GCT 2355
Gln Asn Asn Leu Lys Gly Leu Gln His Ser Leu Thr Ala Leu Gln Ala
765 770 775 780

AGC CGT GAT GGC GAC ACA TTG CGG CAA AAA CAT TAC AGC GAC CTG ATT 2403
Ser Arg Asp Gly Asp Thr Leu Arg Gln Lys His Tyr Ser Asp Leu Ile
785 790 795

AAC GGT GGT CTA TCT GCG GCA GAA ATC GCC GGT CTG ACA CTA CGC AGC 2451
Asn Gly Gly Leu Ser Ala Ala Glu Ile Ala Gly Leu Thr Leu Arg Ser
800 805 810

ACC GCC ATG ATT ACC AAT GGC GTT GCA ACG GGA TTG CTG ATT GCC GGC 2499
Thr Ala Met Ile Thr Asn Gly Val Ala Thr Gly Leu Leu Ile Ala Gly
815 820 825

GGA ATC GCC AAC GCG GTA CCT AAC GTC TTC GGG CTG GCT AAC GGT GGA 2547
Gly Ile Ala Asn Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly
830 835 840

TCG GAA TGG GGA GCG CCA TTA ATT GGC TCC GGG CAA GCA ACC CAA GTT 2595
Ser Glu Trp Gly Ala Pro Leu Ile Gly Ser Gly Gln Ala Thr Gln Val
845 850 855 860

GGC GCC GGC ATC CAG GAT CAG AGC GCG GGC ATT TCA GAA GTG ACA GCA 2643
Gly Ala Gly Ile Gln Asp Gln Ser Ala Gly Ile Ser Glu Val Thr Ala
865 870 875

GGC TAT CAG CGT CGT CAG GAA GAA TGG GCA TTG CAA CGG GAT ATT GCT 2691
Gly Tyr Gln Arg Arg Gln Glu Glu Trp Ala Leu Gln Arg Asp Ile Ala
880 885 890

GAT AAC GAA ATA ACC CAA CTG GAT GCC CAG ATA CAA AGC CTG CAA GAG 2739
Asp Asn Glu Ile Thr Gln Leu Asp Ala Gln Ile Gln Ser Leu Gln Glu
895 900 905

CAA ATC ACG ATG GCA CAA AAA CAG ATC ACG CTC TCT GAA ACC GAA CAA 2787
Gln Ile Thr Met Ala Gln Lys Gln Ile Thr Leu Ser Glu Thr Glu Gln
910 915 920

GCG AAT GCC CAA GCG ATT TAT GAC CTG CAA ACC ACT CGT TTT ACC GGG 2835
Ala Asn Ala Gln Ala Ile Tyr Asp Leu Gln Thr Thr Arg Phe Thr Gly
925 930 935 940

CAG GCA CTG TAT AAC TGG ATG GCC GGT CGT CTC TCC GCG CTC TAT TAC 2883
Gln Ala Leu Tyr Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr
945 950 955

CAA ATG TAT GAT TCC ACT CTG CCA ATC TGT CTC CAG CCA AAA GCC GCA 2931
Gln Met Tyr Asp Ser Thr Leu Pro Ile Cys Leu Gln Pro Lys Ala Ala
960 965 970

TTA GTA CAG GAA TTA GGC GAG AAA GAG AGC GAC AGT CTT TTC CAG GTT 2979
Leu Val Gln Glu Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gln Val
975 980 985

CCG GTG TGG AAT GAT CTG TGG CAA GGG CTG TTA GCA GGA GAA GGT TTA 3027
Pro Val Trp Asn Asp Leu Trp Gln Gly Leu Leu Ala Gly Glu Gly Leu
990 995 1000

AGT TCA GAG CTA CAG AAA CTG GAT GCC ATC TGG CTT GCA CGT GGT GGT 3075
Ser Ser Glu Leu Gln Lys Leu Asp Ala Ile Trp Leu Ala Arg Gly Gly
1005 1010 1015 1020

ATT GGG CTA GAA GCC ATC CGC ACC GTG TCG CTG GAT ACC CTG TTT GGC 3123
Ile Gly Leu Glu Ala Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly
1025 1030 1035

ACA GGG ACG TTA AGT GAA AAT ATC AAT AAA GTG CTT AAC GGG GAA ACG 3171
Thr Gly Thr Leu Ser Glu Asn Ile Asn Lys Val Leu Asn Gly Glu Thr
1040 1045 1050

GTA TCT CCA TCC GGT GGC GTC ACT CTG GCG CTG ACA GGG GAT ATC TTC 3219
Val Ser Pro Ser Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe
1055 1060 1065

CAA GCA ACA CTC GAT TTG AGT CAG CTA GGT TTG GAT AAC TCT TAC AAC 3267
Gln Ala Thr Leu Asp Leu Ser Gln Leu Gly Leu Asp Asn Ser Tyr Asn
1070 1075 1080

TTG GGT AAC GAG AAG AAA CGT CGT ATT AAA CGT ATC GCC GTC ACC CTG 3315
Leu Gly Asn Glu Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu
1085 1090 1095 1100

CCA ACA CTT CTG GGG CCA TAT CAA GAT CTT GAA GCC ACA CTG GTA ATG 3363
Pro Thr Leu Leu Gly Pro Tyr Gln Asp Leu Glu Ala Thr Leu Val Met
1105 1110 1115

GGT GCG GAA ATC GCC GCC TTA TCA CAC GGT GTG AAT GAC GGA GGC CGG 3411
Gly Ala Glu Ile Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg
1120 1125 1130

TTT GTT ACC GAC TTT AAC GAC AGC CGT TTT CTG CCT TTT GAA GGT CGA 3459
Phe Val Thr Asp Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg
1135 1140 1145

GAT GCA ACA ACC GGC ACA CTG GAG CTC AAT ATT TTC CAT GCG GGT AAA 3507
Asp Ala Thr Thr Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys
1150 1155 1160

GAG GGA ACG CAA CAC GAG TTG GTC GCG AAT CTG AGT GAC ATC ATT GTG 3555
Glu Gly Thr Gln His Glu Leu Val Ala Asn Leu Ser Asp Ile Ile Val
1165 1170 1175 1180

CAT CTG AAT TAC ATC ATT CGA GAC GCG TAA ATTTCTTTTC TTTGTCGATT 3605
His Leu Asn Tyr Ile Ile Arg Asp Ala *
1185 1190

ACAGGTCCCT ATCAGGGGCC TGTTATTAAG GAGTACTTTA TGCAGGATTC ACCAGAACTA 3665
TCGATTACAA CGCTGTCACT TCCCAAAGGT GGCGGTGCTA TCAATGGCAT GGGAGAAGCA 3725
CTGAATGCTG CCGGCCCTGA TGGAATGGCC TCCCTATCTC TGCCATTACC CCTTTCGACC 3785
GGCAGACGGA CGGCTCCTGG ATTATCGCTG ATTTACAGCA ACAGTGCAGG TAATGGGCCT 3845
TTCGCCATCG GCTGGCAATG CGGTGTTATG TCCATTAGCC GACGCACCCA ACATCGCATT 3905
CCACAATACG GTAATGACGA CACGTTCCTA TCCCCACAAG GCGAGGTCAT GAATATCGCC 3965
CTGAATGACC AAGGGCAACC TGATATCCGT CAAGACGTTA AAACGCTGCA AGGCGTTACC 4025
TTGCCAATTT CCTATACCGT GACCCGCTAT CAAGCCCGCC AGATCCTGGA TTTCAGTAAA 4085
ATCGAATACT GGCAACCTGC CTCCGGTCAA GAAGGACGCG CTTTCTGGCT GATATCGACA 4145
CCGGACGGGC ATCTACACAT CTTAGGGAAA ACCGCGCAGG CTTGTCTGGC AAATCCGCAA 4205
AATGACCAAC AAATCGCCCA GTGGTTGCTG GAAGAAACTG TGACGCCAGC CGGTGAACAT 4265
GTCAGCTATC AATATCGAGC CGAAGATGAA GCCCATTGTG ACGACAATGA AAAAACCCCT 4325
CATCCCAATG TTACCGCACA GCGCTATCTG GTACAGGTGA ACTACAGGCA ACATCAAACC 4385
ACAAGCCAGC CTGTTCGTAC TGGATAACGC ACCTCCCGCA CCGGAAGAGT GGCTGTTTCA 4445
TCTGGTCTTT GACCACGGTG AGCGCGTACC TCACTTCATA CCGTGCCAAC ATGGGATGCA 4505
GGTACAGCGC AATGGTCTGT ACGCCCGGAT ATCTTCTCTC GCTATGAATA TGGTTTTGAA 4565
GTGCGTACTC GCCGCTTATG TCAACAAGTG CTGATGTTTC ACCGCACCGC GCTCATGGCC 4625
GGAGAAGCCA GTACCAATGA CGCCCCGGAA CTGGTTGGAC GCTTAATACT GGAATATGAC 4685
AAAAACGCCA GCGTCACCAC GTTGATTACC ATCCGTCAAT TAAGCCATGA ATCGGACGGG 4745
AGGCCAGTCA CCCAGCCACC ACTAGAACTA GCCTGGCAAC GGTTTGATCT GGAGAAAATC 4805
CCGACATGCC AACGCTTTQA CGCACTAGAT AATTTTAACT CGCAGCAACG TTATCAACTG 4865
GTTGATCTGC GGGGAGAAGG GTTGCCAGGT ATGCTGTATC AAGATCGAGG CGCTTGGTGG 4925
TATAAAGCTC CGCAACGTCA GGAAGACGGA GACAGCAATG CCGTCACTTA CGACAAAATC 4985
GCCCCACTGC CTACCCTACC CAATTTGCAG GATAATGCCT CATTGATGGA TATCAACGGA 5045
GACGGCCAAC TGGATTGGGT TGTTACCGCC TCCGGTATTC GCGGATACCA TAGTCAGCAA 5105
CCCGATGGAA AGTGGACGCA CTTTACGCCA ATCAATGCCT TGCCCGTGCA ATATTTTCAT 5165
CCAAGCATCC AGTTCGCTGA CCTTACCGGG GCAGGCTTAT CTGATTTAGT GTTGATCGGG 5225
CCGAAAAGCG TGCGTCTATA TGCCAACCAG CGAAACGGCT GGCGTAAAGG AGAAGATGTC 5285
CCCCAATCCA CAGGTATCAC CCTGCCTGTC ACAGGGACCG ATGCCCGCAA ACTGGTGGCT 5345
TTCAGTGATA TGCTCGGTTC CGGTCAACAA CATCTGGTGG AAATCAAGGG TAATCGCGTC 5405
ACCTGTTGGC CGAATCTAGG GCATGGCCGT TTCGGTCAAC CACTAACTCT GTCAGGATTT 5465
AGCCAGCCCG AAAATAGCTT CAATCCCGAA CGGCTGTTTC TGGCGGATAT CGACGGCTCC 5525
GGCACCACCG ACCTTATCTA TGCGCAATCC GGCTCTTTGC TCATTTATCT CAACCAAAGT 5585
GCTAATCAGT TTCATGCCCC GTTGACATTA GCGTTGCCAG AAGGCGTACA ATTTGACAAC 5645
ACTTGCCAAC TTCAAGTCGC CGATATTCAG GGATTAGGGA TAGCCAGCTT GATTCTGACT 5705
GTGCCACATA TCGCGCCACA TCACTGGCGT TGTGACCTGT CACTGACCAA ACCCTGGTTG 5765
TTGAATGTAA TGAACAATAA CCGGGGCGCA CATCACACGC TACATTATCG TAGTTCCGCG 5825
CAATTCTGGT TGGATGAAAA ATTACAGCTC ACCAAAGCAG GCAAATCTCC GGCTTGTTAT 5885
CTGCCGTTTC CAATGCATTT GCTATGGTAT ACCGAAATTC AGGATGAAAT CAGCGGCAAC 5945
CGGCTCACCA GTGAAGTCAA CTACAGCCAC GGCGTCTGGG ATGGTAAAGA GCGGGAATTC 6005
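
The codon rows of SEQ ID NO: 25 pair each DNA triplet with its residue, and the CDS feature (16..3585) fixes the reading frame. A minimal Python sketch of checking a row against the standard genetic code follows. It is an illustration under stated assumptions: the helper `translate` and the table construction are introduced here and are not part of the specification, and the standard (table 1) genetic code is assumed to apply.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) codon by codon.

    Trailing bases that do not fill a whole codon are dropped.
    """
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First twelve codons and last four codons of the CDS, copied from the listing.
first_codons = "ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA"
last_codons = "CGA GAC GCG TAA"

print(translate(first_codons))  # MSESLFTQTLKE
print(translate(last_codons))   # RDA*

# CDS feature 16..3585: number of codon positions, stop codon included.
cds_start, cds_end = 16, 3585
n_codons = (cds_end - cds_start + 1) // 3
print(n_codons)  # 1190
```

The count of 1190 codon positions matches the numbering of the translated sequence, whose final position is the stop codon written as `*`.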

(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1190 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp
35 40 45

Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile
50 55 60

Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly
65 70 75 80

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gln Phe Leu
85 90 95

Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys
100 105 110

Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg
115 120 125

Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln Gly Ile Ser Gln
130 135 140

Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu
145 150 155 160

Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln
165 170 175

Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala
180 185 190



Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly
195 200 205

Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala Ile Asn Ala Gly
210 215 220

Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn
225 230 235 240

Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp
245 250 255

Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Trp Ala
260 265 270

Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val
275 280 285

Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gln Tyr Gly
290 295 300

Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr Val Leu Ile Phe
305 310 315 320

Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr
325 330 335

Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn
340 345 350

Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly
355 360 365

Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380

Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln
385 390 395 400

Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His Leu Pro Asn Tyr
405 410 415

Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp Ser Leu Leu Asn
420 425 430

Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro Val Asp Asn Phe
435 440 445

Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro
450 455 460

Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg Tyr Glu Asp Ala
465 470 475 480

Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly Tyr Arg Asp Ala
485 490 495

Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val
500 505 510

Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr Gln Pro Ala Thr
515 520 525

Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met His Tyr Lys Leu
530 535 540

Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ser
545 550 555 560

Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr
565 570 575

Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro Asp Ile His Thr
580 585 590

Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala Ile
595 600 605

Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala
610 615 620

Trp Leu Ser Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu
625 630 635 640

Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu
645 650 655

Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu
660 665 670

Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg
675 680 685

Gln Gln Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln
690 695 700

Gly Ser Val Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg
705 710 715 720

Ser Ala Val Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr
725 730 735

Leu Glu His Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln
740 745 750

Gln Glu Ala Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu
755 760 765

Lys Gly Leu Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly
770 775 780

Asp Thr Leu Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu
785 790 795 800

Ser Ala Ala Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile
805 810 815

Thr Asn Gly Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn
820 825 830

Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly
835 840 845

Ala Pro Leu Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile
850 855 860

Gln Asp Gln Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg
865 870 875 880

Arg Gln Glu Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile
885 890 895

Thr Gln Leu Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met
900 905 910

Ala Gln Lys Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln
915 920 925

Ala Ile Tyr Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr
930 935 940

Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp
945 950 955 960

Ser Thr Leu Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu
965 970 975

Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn
980 985 990

Asp Leu Trp Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu
995 1000 1005

Gln Lys Leu Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu
1010 1015 1020

Ala Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu
1025 1030 1035 1040

Ser Glu Asn Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser
1045 1050 1055

Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu
1060 1065 1070

Asp Leu Ser Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu
1075 1080 1085

Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu
1090 1095 1100

Gly Pro Tyr Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile
1105 1110 1115 1120

Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp
1125 1130 1135

Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr
1140 1145 1150

Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln
1155 1160 1165

His Glu Leu Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr
1170 1175 1180

Ile Ile Arg Asp Ala *
1185 1190



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1881 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1881
(D) OTHER INFORMATION: /product= "P8"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

10 

ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA GCG CGC CGT GAT 48
Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC GCA GAT TTA AAA 96
Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT CTG TTG CTG GAT 144
Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp
35 40 45

ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG TCC GAA GCG ATT 192
Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile
50 55 60

GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG GGC TAT GAC GGC 240
Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly
65 70 75 80

ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT GAA CAG TTT TTA 288
Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gln Phe Leu
85 90 95

TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT TGG GCT GGC AAG 336
Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys
100 105 110

GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT CCA ACA TTG CGA 384
Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg
115 120 125

TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA GGT ATT TCT CAA 432
Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln Gly Ile Ser Gln
130 135 140

GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA CGT GAT TAT CTA 480
Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu
145 150 155 160

ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT ACT GCC TGC CAA 528
Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln
165 170 175

GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT ACA CAG AAT GCA 576
Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala
180 185 190

CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC ACT GAT GGC GGT 624
Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly
195 200 205

AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA ATT AAT GCC GGG 672
Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala Ile Asn Ala Gly
210 215 220

ATT AGT GAG GCA TAT TCA GGG CAT GTC GAG CCT TTC TGG GAA AAT AAC 720
Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn
225 230 235 240

AAG CTC CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA GAT AAA ATA GAT 768
Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp
245 250 255

TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT TAT AGC TGG GCA 816
Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Trp Ala
260 265 270

TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT GAC TAC AAT AGA GTT 864
Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val
275 280 285

GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT TCA CAA TAT GGT 912
Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gln Tyr Gly
290 295 300

TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT GTA CTT ATT TTT 960
Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr Val Leu Ile Phe
305 310 315 320

CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG ACG TTA TGT TAT 1008
Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr
325 330 335

GAC TCT GGC AAC GTG ATT AAG AAC CTA TCT AGT ACA GGA AGT GCA AAT 1056
Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn
340 345 350

TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC ATG TGT CAT GGA 1104
Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly
355 360 365

CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA CTC TCT ATT AAT 1152
Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380

ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA GAT GGA AAA CAA 1200
Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln
385 390 395 400

TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC CTC CCT AAT TAT 1248
Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His Leu Pro Asn Tyr
405 410 415

GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT TCA CTA CTT AAT 1296
Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp Ser Leu Leu Asn
420 425 430

TAT GAC GTT CAG GGG CAG TTT GGC GGA TCT AAT CCG GTT GAT AAT TTC 1344
Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro Val Asp Asn Phe
435 440 445

AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC TTC CAT ATT CCG 1392
Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro
450 455 460

TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT TAC GAA GAC GCG 1440
Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg Tyr Glu Asp Ala
465 470 475 480

GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT TAT CGC GAT GCT 1488
Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly Tyr Arg Asp Ala
485 490 495

AAT GGC CAG CTC ATT ATG GAT GGC AGT AAA CCA CGT TAT TGG AAT GTG 1536
Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val
500 505 510

ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA CAG CCC GCC ACC 1584
Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr Gln Pro Ala Thr
515 520 525

ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG CAT TAC AAG CTG 1632
Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met His Tyr Lys Leu
530 535 540

GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC CGA GGC GAC AGC 1680
Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ser
545 550 555 560

GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA GCC AAA ATG TAC 1728
Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr
565 570 575

TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT GAT ATC CAT ACC 1776
Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro Asp Ile His Thr
580 585 590

ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA GCT GGC GCT ATT 1824
Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala Ile
595 600 605

GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG ACG TTC GCT GCC 1872
Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala
610 615 620

TGG CTA AGC 1881
Trp Leu Ser
625



(2) INFORMATION FOR SEQ ID NO: 28: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 627 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp
1 5 10 15

Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys
20 25 30

Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp
35 40 45

Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile
50 55 60

Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly
65 70 75 80

Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gln Phe Leu
85 90 95

Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys
100 105 110

Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg
115 120 125

Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln Gly Ile Ser Gln
130 135 140

Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu
145 150 155 160

Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln
165 170 175

Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala
180 185 190

Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly
195 200 205

Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala Ile Asn Ala Gly
210 215 220

Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn
225 230 235 240

Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp
245 250 255

Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Trp Ala
260 265 270

Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp Tyr Asn Arg Val
275 280 285

Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gln Tyr Gly
290 295 300

Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr Val Leu Ile Phe
305 310 315 320

Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr
325 330 335

Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn
340 345 350

Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly
355 360 365

Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380

Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln
385 390 395 400

Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His Leu Pro Asn Tyr
405 410 415

Val Asp Leu Asn Ala Leu Leu Asp He Ser Leu Asp Ser Leu Leu Asn 
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420 425 43 0 

Tyr Asp 7a 1 Gin Gly Gin Phe Gly Gly Ser Asn Pro 7a 1 Asp Asn F-hs 
435 440 445 

5 

Ser Gly Pro Tyr Gly lie Tyr Leu Trp Glu lie Phe Phe His lie Pro 
450 455 460 

Phe Leu Val Thr 7a 1 Arg Met Gin Thr Glu Gin Arg Tyr Glu Asp Ala 
10 465 470 475 430 

Asp Thr Trp Tyr Lys Tyr lie Phe Arg Ser Ala Gly Tyr Arg Asp Ala 
485 490 435 

15 Asn Gly Gin Leu lie Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 
500 505 510 



20 



Met Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr Gin Pro Ala Thr 
515 520 525 

Thr Asp Pro Asp Val He Ala Met Ala Asp Pro Met His Tyr Lys Leu 
530 535 540 



Ala lie Phe Leu His Thr Leu Asp Leu Leu lie Ala Arg Gly Asp Ser 

25 545 550 555 560 

Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 

565 570 575 

30 Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro Asp He His Thr 

580 585 590 



35 



Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala He 

595 600 605 

Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 

610 615 620 



Trp Leu Ser 
40 625 



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1689 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1689
(D) OTHER INFORMATION: /product= "S8"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GCA GGC GAT ACC GCA AAT ATT GGC GAC GGT GAT TTC TTG CCA CCG TAC 48
Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr
1 5 10 15

AAC GAT GTA CTA CTC GGT TAC TGG GAT AAA CTT GAG TTA CGC CTA TAC 96
Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr
20 25 30

AAC CTG CGC CAC AAT CTG AGT CTG GAT GGT CAA CCG CTA AAT CTG CCA 144
Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro
35 40 45

"TO TAT GCC ACG CCG GTA GAC CCG AAA ACC CTG CAA CGC CAG CAA GCC 192
Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala
50 55 60

GGA GGG GAC GGT ACA GGC AGT AGT CCG GCT GGT GGT CAA GGC AGT GTT 240
Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln Gly Ser Val
65 70 75 80

CAG GGC TGG CGC TAT CCG TTA TTG GTA GAA CGC GCC CGC TCT GCC GTG 288
Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val
85 90 95

AGT TTG TTG ACT CAG TTC GGC AAC AGC TTA CAA ACA ACG TTA GAA CAT 336
Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His
100 105 110

CAG GAT AAT GAA AAA ATG ACG ATA CTG TTG CAG ACT CAA CAG GAA GCC 384
Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln Gln Glu Ala
115 120 125

ATC CTG AAA CAT CAG CAC GAT ATA CAA CAA AAT AAT CTA AAA GGA TTA 432
Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu Lys Gly Leu
130 135 140

CAA CAC AGC CTG ACC GCA TTA CAG GCT AGC CGT GAT GGC GAC ACA TTG 480
Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly Asp Thr Leu
145 150 155 160

CGG CAA AAA CAT TAC AGC GAC CTG ATT AAC GGT GGT CTA TCT GCG GCA 528
Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu Ser Ala Ala
165 170 175

GAA ATC GCC GGT CTG ACA CTA CGC AGC ACC GCC ATG ATT ACC AAT GGC 576
Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile Thr Asn Gly
180 185 190

GTT GCA ACG GGA TTG CTG ATT GCC GGC GGA ATC GCC AAC GCG GTA CCT 624
Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn Ala Val Pro
195 200 205

AAC GTC TTC GGG CTG GCT AAC GGT GGA TCG GAA TGG GGA GCG CCA TTA 672
Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu
210 215 220

ATT GGC TCC GGG CAA GCA ACC CAA GTT GGC GCC GGC ATC CAG GAT CAG 720
Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile Gln Asp Gln
225 230 235 240

AGC GCG GGC ATT TCA GAA GTG ACA GCA GGC TAT CAG CGT CGT CAG GAA 768
Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg Arg Gln Glu
245 250 255

GAA TGG GCA TTG CAA CGG GAT ATT GCT GAT AAC GAA ATA ACC CAA CTG 816
Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile Thr Gln Leu
260 265 270

GAT GCC CAG ATA CAA AGC CTG CAA GAG CAA ATC ACG ATG GCA CAA AAA 864
Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met Ala Gln Lys
275 280 285

CAG ATC ACG CTC TCT GAA ACC GAA CAA GCG AAT GCC CAA GCG ATT TAT 912
Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln Ala Ile Tyr
290 295 300

GAC CTG CAA ACC ACT CGT TTT ACC GGG CAG GCA CTG TAT AAC TGG ATG 960
Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr Asn Trp Met
305 310 315 320

GCC CGT CGT CTC TCC GCG CTC TAT TAC CAA ATG TAT GAT TCC ACT CTG 1008
Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp Ser Thr Leu
325 330 335

CCA ATC TGT CTC CAG CCA AAA GCC GCA TTA GTA CAG GAA TTA GGC GAG 1056
Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu Leu Gly Glu
340 345 350

AAA GAG AGC GAC ACT CTT TTC CAG GTT CCG GTG TGG AAT GAT CTG TGG 1104
Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn Asp Leu Trp
355 360 365

CAA GGG CTG TTA GCA GGA GAA GGT TTA AGT TCA GAG CTA CAG AAA CTG 1152
Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu Gln Lys Leu
370 375 380

GAT GCC ATC TGG CTT GCA CGT GGT GGT ATT GGG CTA GAA GCC ATC CGC 1200
Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu Ala Ile Arg
385 390 395 400

ACC GTG TCG CTG GAT ACC CTG TTT GGC ACA GGG ACG TTA AGT GAA AAT 1248
Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn
405 410 415

ATC AAT AAA GTG CTT AAC GGG GAA ACG GTA TCT CCA TCC GGT GGC GTC 1296
Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser Gly Gly Val
420 425 430

ACT CTG GCG CTG ACA GGG GAT ATC TTC CAA GCA ACA CTG GAT TTG AGT 1344
Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu Asp Leu Ser
435 440 445

CAG CTA GGT TTG GAT AAC TCT TAC AAC TTG GGT AAC GAG AAG AAA CGT 1392
Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu Lys Lys Arg
450 455 460

CGT ATT AAA CGT ATC GCC GTC ACC CTG CCA ACA CTT CTG GGG CCA TAT 1440
Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu Gly Pro Tyr
465 470 475 480

CAA GAT CTT GAA GCC ACA CTG GTA ATG GGT GCG GAA ATC GCC GCC TTA 1488
Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile Ala Ala Leu
485 490 495

TCA CAC GGT GTG AAT GAC GGA GGC CGG TTT GTT ACC GAC TTT AAC GAC 1536
Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp
500 505 510

AGC CGT TTT CTG CCT TTT GAA GGT CGA GAT GCA ACA ACC GGC ACA CTG 1584
Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Gly Thr Leu
515 520 525

GAG CTC AAT ATT TTC CAT GCG GGT AAA GAG GGA ACG CAA CAC GAG TTG 1632
Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln His Glu Leu
530 535 540

~TC CCG AAT CTG ACT GAC ATC ATT 073 CAT CTG AAT TAC ATC ATT :GA
Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr Ile Ile Arg
545 550 555 560

GAC GCG TAA 1689
Asp Ala *



(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 563 amino acids 
(B) TYPE: amino acid

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu Pro Pro Tyr
1 5 10 15

Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu Arg Leu Tyr
20 25 30

Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu Asn Leu Pro
35 40 45

Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg Gln Gln Ala
50 55 60

Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln Gly Ser Val
65 70 75 80

Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg Ser Ala Val
85 90 95

Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr Leu Glu His
100 105 110

Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln Gln Glu Ala
115 120 125

Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu Lys Gly Leu
130 135 140

Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly Asp Thr Leu
145 150 155 160

Arg Gln Lys His Tyr Ser Asp Leu Ile Asn Gly Gly Leu Ser Ala Ala
165 170 175

Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile Thr Asn Gly
180 185 190

Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn Ala Val Pro
195 200 205

Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly Ala Pro Leu
210 215 220

Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile Gln Asp Gln
225 230 235 240

Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg Arg Gln Glu
245 250 255

Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile Thr Gln Leu
260 265 270

Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met Ala Gln Lys
275 280 285

Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln Ala Ile Tyr
290 295 300

Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr Asn Trp Met
305 310 315 320

Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp Ser Thr Leu
325 330 335

Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu Leu Gly Glu
340 345 350

Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn Asp Leu Trp
355 360 365

Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu Gln Lys Leu
370 375 380

Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu Ala Ile Arg
385 390 395 400

Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu Ser Glu Asn
405 410 415

Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser Gly Gly Val
420 425 430

Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu Asp Leu Ser
435 440 445

Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu Lys Lys Arg
450 455 460

Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu Gly Pro Tyr
465 470 475 480

Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile Ala Ala Leu
485 490 495

Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp Phe Asn Asp
500 505 510

Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr Gly Thr Leu
515 520 525

Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln His Glu Leu
530 535 540

Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr Ile Ile Arg
545 550 555 560

Asp Ala

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4459 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..4458

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

ATG CAG GAT TCA CCA GAA GTA TCG ATT ACA ACG CTG TCA CTT CCC AAA 48
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
1 5 10 15

GGT GGC GGT GCT ATC AAT GGC ATG GGA GAA GCA CTG AAT GCT GCC GGC 96
Gly Gly Gly Ala Ile Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly
20 25 30

CCT GAT GGA ATG GCC TCC CTA TCT CTG CCA TTA CCC CTT TCG ACC GGC 144
Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly
35 40 45

AGA GGG ACG GCT CCT GGA TTA TCG CTG ATT TAC AGC AAC AGT GCA GGT 192
Arg Gly Thr Ala Pro Gly Leu Ser Leu Ile Tyr Ser Asn Ser Ala Gly
50 55 60

AAT GGG CCT TTC GGC ATC GGC TGG CAA TGC GGT GTT ATG TCC ATT AGC 240
Asn Gly Pro Phe Gly Ile Gly Trp Gln Cys Gly Val Met Ser Ile Ser
65 70 75 80

CGA CGC ACC CAA CAT GGC ATT CCA CAA TAC GGT AAT GAC GAC ACG TTC 288
Arg Arg Thr Gln His Gly Ile Pro Gln Tyr Gly Asn Asp Asp Thr Phe
85 90 95

CTA TCC CCA CAA GGC GAG GTC ATG AAT ATC GCC CTG AAT GAC CAA GGG 336
Leu Ser Pro Gln Gly Glu Val Met Asn Ile Ala Leu Asn Asp Gln Gly
100 105 110

CAA CCT GAT ATC CGT CAA GAC GTT AAA ACG CTG CAA GGC GTT ACC TTG 384
Gln Pro Asp Ile Arg Gln Asp Val Lys Thr Leu Gln Gly Val Thr Leu
115 120 125

CCA ATT TCC TAT ACC GTG ACC CGC TAT CAA GCC CGC CAG ATC CTG GAT 432
Pro Ile Ser Tyr Thr Val Thr Arg Tyr Gln Ala Arg Gln Ile Leu Asp
130 135 140

TTC AGT AAA ATC GAA TAC TGG CAA CCT GCC TCC GGT CAA GAA GGA CGC 480
Phe Ser Lys Ile Glu Tyr Trp Gln Pro Ala Ser Gly Gln Glu Gly Arg
145 150 155 160

GCT TTC TGG CTG ATA TCG ACA CCG GAC GGG CAT CTA CAC ATC TTA GGG 528
Ala Phe Trp Leu Ile Ser Thr Pro Asp Gly His Leu His Ile Leu Gly
165 170 175

AAA ACC GCG CAG GCT TGT CTG GCA AAT CCG CAA AAT GAC CAA CAA ATC 576
Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile
180 185 190

GCC CAG TGG TTG CTG GAA GAA ACT GTG AC 5 CCA GCC GGT GAA CAT GTC 624
Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val
195 200 205

AGC TAT CAA TAT CGA GCC GAA GAT GAA GCC CAT TGT GAC GAC AAT GAA 672
Ser Tyr Gln Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu
210 215 220

AAA ACC GCT CAT CCC AAT GTT ACC GCA CAG CGC TAT CTG GTA CAG GTG 720
Lys Thr Ala His Pro Asn Val Thr Ala Gln Arg Tyr Leu Val Gln Val
225 230 235 240

AAC TAC GGC AAC ATC AAA CCA CAA GCC AGC CTG TTC GTA CTG GAT AAC 768
Asn Tyr Gly Asn Ile Lys Pro Gln Ala Ser Leu Phe Val Leu Asp Asn
245 250 255

GCA CCT CCC GCA CCG GAA GAG TGG CTG TTT CAT CTG GTC TTT GAC CAC 816
Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His
260 265 270

GGT GAG CGC GAT ACC TCA CTT CAT ACC GTG CCA ACA TGG GAT GCA GGT 864
Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly
275 280 285

ACA GCG CAA TGG TCT GTA CGC CCG GAT ATC TTC TCT CGC TAT GAA TAT 912
Thr Ala Gln Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu Tyr
290 295 300

GGT TTT GAA GTG CGT ACT CGC CGC TTA TCT CAA CAA GTG CTG ATG TTT 960
Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Val Leu Met Phe
305 310 315 320

CAC CGC ACC GCG CTC ATG GCC GGA GAA GCC AGT ACC AAT GAC GCC CCG 1008
His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro
325 330 335

GAA CTG GTT GGA CGC TTA ATA CTG GAA TAT GAC AAA AAC GCC AGC GTC 1056
Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val
340 345 350

ACC ACG TTG ATT ACC ATC CGT CAA TTA AGC CAT GAA TCG GAC GGG AGG 1104
Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Arg
355 360 365

CCA GTC ACC CAG CCA CCA CTA GAA CTA GCC TGG CAA CGG TTT GAT CTG 1152
Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu
370 375 380

GAG AAA ATC CCG ACA TGG CAA CGC TTT GAC GCA CTA GAT AAT TTT AAC 1200
Glu Lys Ile Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn
385 390 395 400

TCG CAG CAA CGT TAT CAA CTG GTT GAT CTG CGG GGA GAA GGG TTG CCA 1248
Ser Gln Gln Arg Tyr Gln Leu Val Asp Leu Arg Gly Glu Gly Leu Pro
405 410 415

GGT ATG CTG TAT CAA GAT CGA GGC GCT TGG TGG TAT AAA GCT CCG CAA 1296
Gly Met Leu Tyr Gln Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gln
420 425 430

CGT CAG GAA GAC GGA GAC AGC AAT GCC GTC ACT TAC GAC AAA ATC GCC 1344
Arg Gln Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala
435 440 445

rro cct Arc cta ccc aat ttg cag gat aat gcc tca ttc ato gat
Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp
450 455 460

ATC AAC GGA GAC GGC CAA CTG GAT TGG GTT GTT ACC GCC TCC GGT ATT 1440
Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile
465 470 475 480

CGC GGA TAC CAT AGT CAG CAA CCC GAT GGA AAG TGG ACG CAC TTT ACG 1488
Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr
485 490 495

CCA ATC AAT GCC TTG CCC GTG GAA TAT TTT CAT CCA AGC ATC CAG TTC 1536
Pro Ile Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe
500 505 510

GCT GAC CTT ACC GGG GCA GGC TTA TCT GAT TTA GTG TTG ATC GGG CCG 1584
Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro
515 520 525

AAA AGC GTG CGT CTA TAT GCC AAC CAG CGA AAC GGC TGG CGT AAA GGA 1632
Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly
530 535 540

GAA GAT GTC CCC CAA TCC ACA GGT ATC ACC CTG CCT GTC ACA GGG ACC 1680
Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr
545 550 555 560

GAT GCC CGC AAA CTG GTG GCT TTC AGT GAT ATG CTC GGT TCC GGT CAA 1728
Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln
565 570 575

CAA CAT CTG GTG GAA ATC AAG GGT AAT CGC GTC ACC TGT TGG CCG AAT 1776
Gln His Leu Val Glu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn
580 585 590

CTA GGG CAT GGC CGT TTC GGT CAA CCA CTA ACT CTG TCA GGA TTT AGC 1824
Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser
595 600 605

CAG CCC GAA AAT AGC TTC AAT CCC GAA CGG CTG TTT CTG GCG GAT ATC 1872
Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile
610 615 620

GAC GGC TCC GGC ACC ACC GAC CTT ATC TAT GCG CAA TCC GGC TCT TTG 1920
Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu
625 630 635 640

CTC ATT TAT CTC AAC CAA AGT GGT AAT CAG TTT GAT GCC CCG TTG ACA 1968
Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr
645 650 655

TTA GCG TTG CCA GAA GGC GTA CAA TTT GAC AAC ACT TCC CAA CTT CAA 2016
Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln
660 665 670

GTC GCC GAT ATT CAG GGA TTA GGG ATA GCC AGC TTG ATT CTG ACT GTG 2064
Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val
675 680 685

CCA CAT ATC GCG CCA CAT CAC TGG CGT TGT GAC CTG TCA CTG ACC AAA 2112
Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700

CCC TGG TTG TTG AAT GTA ATG AAC AAT AAC CGG GGC GCA CAT CAC ACG 2160
Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720

CTA CAT TAT CGT ACT TCC GCG CAA TTC TGG TTG GAT GAA AAA TTA CAG 2208
Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln
725 730 735

CTC ACC AAA GCA GGC AAA TCT CCG GCT TGT TAT CTG CCG TTT CCA ATG 2256
Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met
740 745 750

CAT TTG CTA TGG TAT ACC GAA ATT CAG GAT GAA ATC AGC GGC AAC CGG 2304
His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg
755 760 765

CTC ACC ACT GAA GTC AAC TAC AGC CAC GGC GTC TGG GAT GGT AAA GAG 2352
Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780

CGG GAA TTC AGA GGA TTT GGC TGC ATC AAA CAG ACA GAT ACC ACA ACG 2400
Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr
785 790 795 800

TTT TCT CAC GGC ACC GCC CCC GAA CAG GCG GCA CCG TCG CTG AGT ATT 2448
Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile
805 810 815

AGC TGG TTT GCC ACC GGC ATG GAT GAA GTA GAC AGC CAA TTA GCT ACG 2496
Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr
820 825 830

GAA TAT TGG CAG GCA GAC ACG CAA GCT TAT AGC GGA TTT GAA ACC CGT 2544
Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg
835 840 845

TAT ACC GTC TGG GAT CAC ACC AAC CAG ACA GAC CAA GCA TTT ACC CCC 2592
Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro
850 855 860

AAT GAG ACA CAA CGT AAC TGG CTG ACG CGA GCG CTT AAA GGC CA^ CTG 2640
Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu
865 870 875 880

CTA CGC ACT GAG CTC TAC GGT CTG GAC GGA ACA GAT AAG CAA ACA GTG 2688
Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val
885 890 895

CCT TAT ACC GTC AGT GAA TCG CGC TAT CAG GTA CGC TCT ATT CCC GTA 2736
Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val
900 905 910

AAT AAA GAA ACT GAA TTA TCT GCC TGG GTC ACT GCT ATT GAA AAT CGC 2784
Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg
915 920 925

AGC TAC CAC TAT GAA CGT ATC ATC ACT GAC CCA CAG TTC AGC CAG AGT 2832
Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser
930 935 940

ATC AAG TTG CAA CAC GAT ATC TTT GGT CAA TCA CTG CAA AGT GTC GAT 2880
Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp
945 950 955 960

ATT GCC TGG CCG CGC CGC GAA AAA CCA GCA GTG AAT CCC TAC CCG CCT 2928
Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975

:TG ZZG GAA ACC CTA TTT GAT AGC ACC TAT 3 AT CAT CA_A CAA 7A.A J ■="■5
Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln
980 985 990

CTA TTA CGT CTG GTG AGA CAA AAA AAT AGC TGG CAT CAC CTG ACT GAT 3024
Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp
995 1000 1005

CCC CAA AAC TGG CGA TTA GGT TTA CCG AAT GCA CAA CGC CGT GAT CTT 3072
Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gln Arg Arg Asp Val
1010 1015 1020

TAT ACT TAT GAC CGG AGC AAA ATT CCA ACC GAA GGG ATT TCC CTT CAA 3120
Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser Leu Glu
1025 1030 1035 1040

ATC TTG CTG AAA GAT GAT GGC CTG CTA GCA GAT GAA AAA GCG GCC GTT 3168
Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val
1045 1050 1055

TAT CTG GGA CAA CAA CAG ACG TTT TAC ACC GCC GGT CAA GCG GAA GTC 3216
Tyr Leu Gly Gln Gln Gln Thr Phe Tyr Thr Ala Gly Gln Ala Glu Val
1060 1065 1070

ACT CTA GAA AAA CCC ACG TTA CAA GCA CTG GTC GCG TTC CAA GAA ACC 3264
Thr Leu Glu Lys Pro Thr Leu Gln Ala Leu Val Ala Phe Gln Glu Thr
1075 1080 1085

GCC ATG ATG GAC GAT ACC TCA TTA CAG GCG TAT GAA GGC GTG ATT GAA 3312
Ala Met Met Asp Asp Thr Ser Leu Gln Ala Tyr Glu Gly Val Ile Glu
1090 1095 1100

GAG CAA GAG TTG AAT ACC GCG CTG ACA CAG GCC GGT TAT CAG CAA GTC 3360
Glu Gln Glu Leu Asn Thr Ala Leu Thr Gln Ala Gly Tyr Gln Gln Val
1105 1110 1115 1120

GCG CGG TTG TTT AAT ACC AGA TCA GAA AGC CCG GTA TGG GCG GCA CGG 3408
Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg
1125 1130 1135

CAA GGT TAT ACC GAT TAC GGT GAC GCC GCA CAG TTC TGG CGG CCT CAG 3456
Gln Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gln Phe Trp Arg Pro Gln
1140 1145 1150

GCT CAG CGT AAC TCG TTG CTG ACA GGG AAA ACC ACA CTG ACC TGG GAT 3504
Ala Gln Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp
1155 1160 1165

ACC CAT CAT TGT GTA ATA ATA CAG ACT CAA GAT GCC GCT GGA TTA ACG 3552
Thr His His Cys Val Ile Ile Gln Thr Gln Asp Ala Ala Gly Leu Thr
1170 1175 1180

ACG CAA GCC CAT TAC GAT TAT CGT TTC CTT ACA CCG GTA CAA CTG ACA 3600
Thr Gln Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gln Leu Thr
1185 1190 1195 1200

GAT ATT AAT GAT AAT CAA CAT ATT GTG ACT CTG GAC GCG CTA GGT CGC 3648
Asp Ile Asn Asp Asn Gln His Ile Val Thr Leu Asp Ala Leu Gly Arg
1205 1210 1215

GTA ACC ACC AGC CGG TTC TGG GGC ACA GAG GCA GGA CAA GCC GCA GGC 3696
Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gln Ala Ala Gly
1220 1225 1230

TAT TCC AAC CAG CCC TTC ACA CCA CCG GAC TCC GTA GAT AAA GCG CTG 3744
Tyr Ser Asn Gln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu
1235 1240 1245

GCA TTA ACC GGC CCA CTC CCT GTT GCC CAA TGT TTA GTC TAT GCC GTT 3792
Ala Leu Thr Gly Ala Leu Pro Val Ala Gln Cys Leu Val Tyr Ala Val
1250 1255 1260

GAT AGC TGG ATG CCG TCG TTA TCT TTG TCT CAG CTT TCT CAG TCA CAA 3840
Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gln Leu Ser Gln Ser Gln
1265 1270 1275 1280

GAA GAG GCA GAA GCG CTA TGG GCG CAA CTG CGT GCC GCT CAT ATG ATT 3888
Glu Glu Ala Glu Ala Leu Trp Ala Gln Leu Arg Ala Ala His Met Ile
1285 1290 1295

ACC GAA GAT GGG AAA GTG TGT GCG TTA AGC GGG AAA CGA GGA ACA AGC 3936
Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser
1300 1305 1310

CAT CAG AAC CTG ACG ATT CAA CTT ATT TCG CTA TTG GCA AGT ATT CCC 3984
His Gln Asn Leu Thr Ile Gln Leu Ile Ser Leu Leu Ala Ser Ile Pro
1315 1320 1325

CGT TTA CCG CCA CAT GTA CTG GGG ATC ACC ACT GAT CGC TAT GAT AGC 4032
Arg Leu Pro Pro His Val Leu Gly Ile Thr Thr Asp Arg Tyr Asp Ser
1330 1335 1340

GAT CCG CAA CAG CAG CAC CAA CAG ACG GTG AGC TTT AGT GAC GGT TTT 4080
Asp Pro Gln Gln Gln His Gln Gln Thr Val Ser Phe Ser Asp Gly Phe
1345 1350 1355 1360

GGC CCG TTA CTC CAG AGT TCA GCT CGT CAT GAG TCA GGT GAT GCC TGG 4128
Gly Arg Leu Leu Gln Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp
1365 1370 1375

CAA CGT AAA GAG GAT GGC GGG CTG GTC GTG GAT GCA AAT GGC GTT CTG 4176
Gln Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu
1380 1385 1390

GTC AGT GCC CCT ACA GAC ACC CGA TGG GCC GTT TCC GGT CGC ACA GAA 4224
Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu
1395 1400 1405

TAT GAC GAC AAA GGC CAA CCT GTG CGT ACT TAT CAA CCC TAT TTT CTA 4272
Tyr Asp Asp Lys Gly Gln Pro Val Arg Thr Tyr Gln Pro Tyr Phe Leu
1410 1415 1420

AAT GAC TGG CGT TAC GTT AGT GAT GAC AGC GCA CGA GAT GAC CTG TTT 4320
Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe
1425 1430 1435 1440

GCC GAT ACC CAC CTT TAT GAT CCA TTG GGA CGG GAA TAC AAA GTC ATC 4368
Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile
1445 1450 1455

ACT GCT AAG AAA TAT TTG CGA GAA AAG CTG TAC ACC CCG TGG TTT ATT 4416
Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile
1460 1465 1470

GTC AGT GAG GAT GAA AAC GAT ACA GCA TCA AGA ACC CCA TAG 4458
Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro *
1475 1480 1485



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1486 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Leu Ser Leu Pro Lys
1 5 10 15

Gly Gly Gly Ala Ile Asn Gly Met Gly Glu Ala Leu Asn Ala Ala Gly
20 25 30

Pro Asp Gly Met Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly
35 40 45

Arg Gly Thr Ala Pro Gly Leu Ser Leu Ile Tyr Ser Asn Ser Ala Gly
50 55 60

Asn Gly Pro Phe Gly Ile Gly Trp Gln Cys Gly Val Met Ser Ile Ser
65 70 75 80

Arg Arg Thr Gln His Gly Ile Pro Gln Tyr Gly Asn Asp Asp Thr Phe
85 90 95

Leu Ser Pro Gln Gly Glu Val Met Asn Ile Ala Leu Asn Asp Gln Gly
100 105 110

Gln Pro Asp Ile Arg Gln Asp Val Lys Thr Leu Gln Gly Val Thr Leu
115 120 125

Pro Ile Ser Tyr Thr Val Thr Arg Tyr Gln Ala Arg Gln Ile Leu Asp
130 135 140

Phe Ser Lys Ile Glu Tyr Trp Gln Pro Ala Ser Gly Gln Glu Gly Arg
145 150 155 160

Ala Phe Trp Leu Ile Ser Thr Pro Asp Gly His Leu His Ile Leu Gly
165 170 175

Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile
180 185 190

Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val
195 200 205

Ser Tyr Gln Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu
210 215 220

Lys Thr Ala His Pro Asn Val Thr Ala Gln Arg Tyr Leu Val Gln Val
225 230 235 240

Asn Tyr Gly Asn Ile Lys Pro Gln Ala Ser Leu Phe Val Leu Asp Asn
245 250 255

Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His
260 265 270

Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly
275 280 285

Thr Ala Gln Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu Tyr
290 295 300

Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Val Leu Met Phe
305 310 315 320

His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro
325 330 335

Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val
340 345 350

Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Arg
355 360 365

Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu
370 375 380

Glu Lys Ile Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn
385 390 395 400

Ser Gln Gln Arg Tyr Gln Leu Val Asp Leu Arg Gly Glu Gly Leu Pro
405 410 415

Gly Met Leu Tyr Gln Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gln
420 425 430

Arg Gln Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala
435 440 445

Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp
450 455 460

Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile
465 470 475 480

Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr
485 490 495

Pro Ile Asn Ala Leu Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe
500 505 510

Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro
515 520 525

Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly
530 535 540

Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr
545 550 555 560

Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln
565 570 575

Gln His Leu Val Glu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn
580 585 590

Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser
595 600 605

Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile
610 615 620

Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu
625 630 635 640

Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr
645 650 655

Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln
660 665 670

Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val
675 680 685

Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys
690 695 700

Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr
705 710 715 720

Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln
725 730 735

Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met
740 745 750

His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg
755 760 765

Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu
770 775 780

Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr
785 790 795 800

Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile
805 810 815

Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr
820 825 830

Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg
835 840 845

Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro
850 855 860

Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu
865 870 875 880

Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val
885 890 895

Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val
900 905 910

Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala Ile Glu Asn Arg
915 920 925

Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser
930 935 940

Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp
945 950 955 960

Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro
965 970 975

Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln
980 985 990

Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp
995 1000 1005

Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gln Arg Arg Asp Val
1010 1015 1020

Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser Leu Glu
1025 1030 1035 1040

Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val
1045 1050 1055

Tyr Leu Gly Gln Gln Gln Thr Phe Tyr Thr Ala Gly Gln Ala Glu Val
1060 1065 1070

Thr Leu Glu Lys Pro Thr Leu Gln Ala Leu Val Ala Phe Gln Glu Thr
1075 1080 1085

Ala Met Met Asp Asp Thr Ser Leu Gln Ala Tyr Glu Gly Val Ile Glu
1090 1095 1100

Glu Gln Glu Leu Asn Thr Ala Leu Thr Gln Ala Gly Tyr Gln Gln Val
1105 1110 1115 1120

Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg
1125 1130 1135

Gln Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gln Phe Trp Arg Pro Gln
1140 1145 1150

Ala Gln Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp
1155 1160 1165

Thr His His Cys Val Ile Ile Gln Thr Gln Asp Ala Ala Gly Leu Thr
1170 1175 1180

Thr Gln Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gln Leu Thr
1185 1190 1195 1200

Asp Ile Asn Asp Asn Gln His Ile Val Thr Leu Asp Ala Leu Gly Arg
1205 1210 1215

Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gln Ala Ala Gly
1220 1225 1230

Tyr Ser Asn Gln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys Ala Leu
1235 1240 1245

Ala Leu Thr Gly Ala Leu Pro Val Ala Gln Cys Leu Val Tyr Ala Val
1250 1255 1260

Asp Ser Trp Met Pro Ser Leu Ser Leu Ser Gln Leu Ser Gln Ser Gln
1265 1270 1275 1280

Glu Glu Ala Glu Ala Leu Trp Ala Gln Leu Arg Ala Ala His Met Ile
1285 1290 1295

Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser
1300 1305 1310

His Gln Asn Leu Thr Ile Gln Leu Ile Ser Leu Leu Ala Ser Ile Pro
1315 1320 1325

Arg Leu Pro Pro His Val Leu Gly Ile Thr Thr Asp Arg Tyr Asp Ser
1330 1335 1340
Asp Pro Gln Gln Gln His Gln Gln Thr Val Ser Phe Ser Asp Gly Phe
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i 345 



1350 1355



Gly Arg Leu Leu Gln Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp
1365 1370 1375

Gln Arg Lys Glu Asp Gly Gly Leu Val Val Asp Ala Asn Gly Val Leu
1380 1385 1390

Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu
1395 1400 1405

Tyr Asp Asp Lys Gly Gln Pro Val Arg Thr Tyr Gln Pro Tyr Phe Leu
1410 1415 1420

Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe
1425 1430 1435 1440

Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val Ile
1445 1450 1455

Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu Tyr Thr Pro Trp Phe Ile
1460 1465 1470

Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro
1475 1480 1485



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3288 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

ATG GTG ACT GTT ATG CAA AAT AAA ATA TCA TTT TTA TCA GGT ACA TCC 48
Met Val Thr Val Met Gln Asn Lys Ile Ser Phe Leu Ser Gly Thr Ser
1 5 10 15

GAA CAG CCC CTC CTT GAC GCC GGT TAT CAA AAC GTA TTT GAT ATC GCA 96
Glu Gln Pro Leu Leu Asp Ala Gly Tyr Gln Asn Val Phe Asp Ile Ala
20 25 30

TCA ATC AGC CGG GCT ACT TTC GTT CAA TCC GTT CCC ACC CTG CCC GTT 144
Ser Ile Ser Arg Ala Thr Phe Val Gln Ser Val Pro Thr Leu Pro Val
35 40 45

AAA GAG GCT CAT ACC GTC TAT CGT CAG GCG CGG CAA CGT GCG GAA AAT 192
Lys Glu Ala His Thr Val Tyr Arg Gln Ala Arg Gln Arg Ala Glu Asn
50 55 60

CTC AAA TCC CTC TAC CCA GCC TCC CAA TTG CGT CAG GAG CCG GTT ATT 240
Leu Lys Ser Leu Tyr Arg Ala Trp Gln Leu Arg Gln Glu Pro Val Ile
65 70 75 80

AAA GGG CTG GCT AAA CTT AAC CTA CAA TCC AAC GTT TCT CTG CTT CAA 288
Lys Gly Leu Ala Lys Leu Asn Leu Gln Ser Asn Val Ser Val Leu Gln
85 90 95

GAT GCT TTG GTA GAG AAT ATT GGC GGT GAT GGG GAT TTC AGC GAT TTA 336
Asp Ala Leu Val Glu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Leu
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100 105 110 

ATG AAC CGT GCC ACT CAA TAT GCT GAC GCT GCC TCT ATT CAA TCC CTA 384
Met Asn Arg Ala Ser Gln Tyr Ala Asp Ala Ala Ser Ile Gln Ser Leu
115 120 125

TTT TCA CCG GGC CGT TAT GCT TCC GCA CTC TAC AGA GTT GCT AAA GAT 432
Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp
130 135 140

CTG CAT AAA TCA GAT TCC ACT TTG CAT ATT GAT AAT CGC CGC GCT GAT 480
Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp
145 150 155 160

CTG AAG GAT CTG ATA TTA AGC GAA ACG ACG ATG AAT AAA GAG GTC ACT 528
Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr
165 170 175

TCC CTT GAT ATC TTG TTG GAT GTC CTA CAA AAA GGC GGT AAA GAT ATT 576
Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Lys Gly Gly Lys Asp Ile
180 185 190

ACT GAG CTG TCC GGC GCA TTC TTC CCA ATG ACG TTA CCT TAT GAC GAT 624
Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp
195 200 205

CAT CTG TCC CAA ATC GAT TCC GCT TTA TCG GCA CAA GCC AGA ACG CTG 672
His Leu Ser Gln Ile Asp Ser Ala Leu Ser Ala Gln Ala Arg Thr Leu
210 215 220

AAC GGT GTG TGG AAT ACT TTG ACA GAT ACC ACG GCA CAA GCG GTT TCA 720
Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gln Ala Val Ser
225 230 235 240

GAA CAA ACC AGT AAT ACG AAT ACA CGC AAA CTG TTC GCT GCC CAA GAT 768
Glu Gln Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gln Asp
245 250 255

GGT AAT CAA GAT ACA TTT TTT TCC GGA AAC ACT TTT TAT TTC AAA GCG 816
Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala
260 265 270

GTG GGA TTC AGC GGG CAA CCT ATG GTT TAC CTG TCA CAG TAC ACC AGC 864
Val Gly Phe Ser Gly Gln Pro Met Val Tyr Leu Ser Gln Tyr Thr Ser
275 280 285

GGG AAC GGC ATT GTC GGC GCA CAA TTG ATT GCA GGT AAT CCA GAC CAA 912
Gly Asn Gly Ile Val Gly Ala Gln Leu Ile Ala Gly Asn Pro Asp Gln
290 295 300

GCC GCC GCC GCA ATA GTC GCA CCG TTG AAA CTC ACT TGG TCA ATG GCA 960
Ala Ala Ala Ala Ile Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala
305 310 315 320

AAA CAG TGT TAC TAC CTC GTC GCT CCC GAT GGT ACA ACG ATG GGA GAC 1008
Lys Gln Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp
325 330 335

GGT AAT GTT CTG ACC GGC TGT TTC TTA AGA GGC AAC AGC CCA ACT AAC 1056
Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn
340 345 350

CCG GAT AAA GAC GGT ATT TTT GCT CAG GTA GCC AAC AAA TCA GCC AGT 1104
Pro Asp Lys Asp Gly Ile Phe Ala Gln Val Ala Asn Lys Ser Gly Ser
355 360 365
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ACT CAG CCT TTG CCA AGC TTC CAT CTG CCG GTC AC A CTG GAA CAC AGC 1152 
Thr Gln Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser
370 375 380

GAG AAT AAA GAT CAG TAC TAT CTG AAA ACA GAG CAG GGT TAT ATC ACG 1200
Glu Asn Lys Asp Gln Tyr Tyr Leu Lys Thr Glu Gln Gly Tyr Ile Thr
385 390 395 400

GTA GAT AGT TCC GGA CAG TCA AAT TGG AAA AAC GCG CTG GTT ATC AAT 1248
Val Asp Ser Ser Gly Gln Ser Asn Trp Lys Asn Ala Leu Val Ile Asn
405 410 415

GGG ACA AAA GAC AAG GGG CTG TTA TTA ACC TTT TGC AGC GAT AGC TCA 1296
Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser
420 425 430

GGC ACT CCG ACA AAC CCT GAT GAT GTG ATT CCT CCC GCT ATC AAT GAT 1344
Gly Thr Pro Thr Asn Pro Asp Asp Val Ile Pro Pro Ala Ile Asn Asp
435 440 445

ATT CCA TCG CCG CCA GCC CGC GAA ACA CTG TCA CTG ACG CCG GTC AGT 1392
Ile Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser
450 455 460

TAT CAA TTG ATG ACC AAT CCG GCA CCG ACA GAA GAT GAT ATT ACC AAC 1440
Tyr Gln Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp Ile Thr Asn
465 470 475 480

CAT TAT GGT TTT AAC GGC GCT AGC TTA CGG GCT TCT CCA TTG TCA ACC 1488
His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr
485 490 495

AGC GAG TTG ACC AGC AAA CTG AAT TCT ATC GAT ACT TTC TGT GAG AAG 1536
Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr Phe Cys Glu Lys
500 505 510

ACC CGG TTA AGC TTC AAT CAG TTA ATG GAT TTG ACC GCT CAG CAA TCT 1584
Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr Ala Gln Gln Ser
515 520 525

TAC AGT CAA AGC AGC ATT GAT GCG AAA GCA GCC AGC CGC TAT GTT CGT 1632
Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg
530 535 540

TTT GGG GAA ACC ACC CCA ACC CGC GTC AAT GTC TAC GGT GCC GCT TAT 1680
Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr
545 550 555 560

CTG AAC AGC ACA CTG GCA GAC GCG GCT GAT GGT CAA TAT CTG TGG ATT 1728
Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln Tyr Leu Trp Ile
565 570 575

CAG ACT GAT GGC AAG AGC CTA AAT TTC ACT GAC GAT ACG GTA GTC GCC 1776
Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala
580 585 590

TTA GCC GGT CGC GCT GAA AAG CTG GTA CGT TTA TCA TCC CAG ACC GGG 1824
Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gln Thr Gly
595 600 605

CTA TCA TTT GAA GAA TTG GAC TGG CTG ATT GCC AAT GCC AGT CGT AGT 1872
Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn Ala Ser Arg Ser
610 615 620

GTG CCG GAC CAC CAC GAC AAA ATT GTG CTG GAT AAG CCG GTC CTT GAA 1920
Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys Pro Val Leu Glu
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425 530 635 -540 

CCA CTG GCA GAG TAT GTC AGC CTA AAA CAG CGC TAT GGG CTT GAT GCC 1968
Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr Gly Leu Asp Ala
645 650 655

AAT ACC TTT GCG ACC TTC ATT AGT GCA GTA AAT CCT TAT ACG CCA GAT 2016
Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro Tyr Thr Pro Asp
660 665 670

CAG ACA CCC AGT TTC TAT GAA ACC GCT TTC CGC TCT GCC GAC GGT AAT 2064
Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn
675 680 685

CAT GTC ATT GCG CTA GGT ACA GAG GTG AAA TAT GCA GAA AAT GAG CAG 2112
His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gln
690 695 700

GAT GAG TTA GCC GCC ATA TGC TGC AAA GCA TTG GGT GTC ACC AGT GAT 2160
Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly Val Thr Ser Asp
705 710 715 720

GAA CTG CTC CGT ATT GGT CGC TAT TGC TTC GGT AAT GCA GGC AGT TTT 2208
Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe
725 730 735

ACC TTG GAT GAA TAT ACC GCC AGT CAG TTG TAT CGC TTC GGC GCC ATT 2256
Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg Phe Gly Ala Ile
740 745 750

CCC CGT TTG TTT GGG CTG ACA TTT GCC CAA GCC GAA ATT TTA TGG CGT 2304
Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu Ile Leu Trp Arg
755 760 765

CTG ATG GAA GGC GGA AAA GAT ATC TTA TTG CAA CAG TTA GGT CAG GCA 2352
Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln Leu Gly Gln Ala
770 775 780

AAA TCC CTG CAA CCA CTG GCT ATT TTA CGC CGT ACC GAG CAG GTG CTG 2400
Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr Glu Gln Val Leu
785 790 795 800

GAT TGG ATG TCG TCC CTA AAT CTA AGT CTG ACT TAT CTG CAA GGG ATG 2448
Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gln Gly Met
805 810 815

GTA AGT ACG CAA TGG AGC GGT ACC GCC ACC GCT GAG ATG TTC AAT TTC 2496
Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe
820 825 830

TTG GAA AAC GTT TGT GAC AGC GTG AAT AGT CAA GCT GCC ACT AAA GAA 2544
Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala Ala Thr Lys Glu
835 840 845

ACA ATG GAT TCG GCG TTA CAG CAG AAA GTG CTG CGG GCG CTA AGC GCC 2592
Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg Ala Leu Ser Ala
850 855 860

GGT TTC GGC ATT AAG AGC AAT GTG ATG GGT ATC GTC ACC TTC TGG CTG 2640
Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val Thr Phe Trp Leu
865 870 875 880

GAG AAA ATC ACA ATC GGT AGT GAT AAT CCT TTT ACA TTG GCA AAC TAC 2688
Glu Lys Ile Thr Ile Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr
885 890 895
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TGO CAT GAT ATT 7AA ACC CTG TTT AGC CAT GAC AAT GCC ACG TTA GAG ".--J 

Trp His Asp Ile Gln Thr Leu Phe Ser His Asp Asn Ala Thr Leu Glu
900 905 910

TCC TTA CAA ACC GAC ACT TCT CTG GTA ATT GCT ACT CAG CAA CTT AGC 2784
Ser Leu Gln Thr Asp Thr Ser Leu Val Ile Ala Thr Gln Gln Leu Ser
915 920 925

CAG CTA GTG TTA ATT GTG AAA TGG CTG AGC CTG ACC GAG CAG GAT CTG 2832
Gln Leu Val Leu Ile Val Lys Trp Leu Ser Leu Thr Glu Gln Asp Leu
930 935 940

CAA TTA CTG ACA ACC TAT CCC GAA CGT TTA ATC AAC GGC ATC ACG AAT 2880
Gln Leu Leu Thr Thr Tyr Pro Glu Arg Leu Ile Asn Gly Ile Thr Asn
945 950 955 960

GTT CCT GTA CCC AAT CCG GAG CTA TTA CTC ACG CTA TCA CGT TTT AAG 2928
Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys
965 970 975

CAG TGG GAA ACT CAA GTC ACC GTT TCC CGT GAT GAA GCG ATG CGC TCT 2976
Gln Trp Glu Thr Gln Val Thr Val Ser Arg Asp Glu Ala Met Arg Cys
980 985 990

TTC GAT CAA TTA AAT GCC AAT GAT ATG ACG ACT GAA AAT GCA GGT TCA 3024
Phe Asp Gln Leu Asn Ala Asn Asp Met Thr Thr Glu Asn Ala Gly Ser
995 1000 1005

CTG ATC GCC ACA TTG TAT GAG ATG GAT AAA GGT ACG GGA GCG CAA GTT 3072
Leu Ile Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr Gly Ala Gln Val
1010 1015 1020

AAT ACC TTG CTA TTA GGT GAA AAT AAC TGG CCG AAA AGT TTT ACC TCT 3120
Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser
1025 1030 1035 1040

CTC TGG CAA CTT CTG ACC TGG TTA CGC GTC GGG CAA AGA CTG AAT GTC 3168
Leu Trp Gln Leu Leu Thr Trp Leu Arg Val Gly Gln Arg Leu Asn Val
1045 1050 1055

GGT AGT ACC ACT CTG GGC AAT CTG TTG TCC ATG ATG CAA GCA GAC CCT 3216
Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gln Ala Asp Pro
1060 1065 1070

GCT GCC GAG AGT AGC GCT TTA TTG GCA TCA GTA GCC CAA AAC TTA AGT 3264
Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gln Asn Leu Ser
1075 1080 1085

GCC GCA ATC AGC AAT CGT CAG TAA 3288
Ala Ala Ile Ser Asn Arg Gln •••
1090 1095



(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1095 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Features    From    To     Description
            254     267    SEQ ID NO: 15
            254     492    TcaAii peptide
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20 






Met Val Thr Val Met Gln Asn Lys Ile Ser Phe Leu Ser Gly Thr Ser
1 5 10 15

Glu Gln Pro Leu Leu Asp Ala Gly Tyr Gln Asn Val Phe Asp Ile Ala
20 25 30

Ser Ile Ser Arg Ala Thr Phe Val Gln Ser Val Pro Thr Leu Pro Val
35 40 45

Lys Glu Ala His Thr Val Tyr Arg Gln Ala Arg Gln Arg Ala Glu Asn
50 55 60

Leu Lys Ser Leu Tyr Arg Ala Trp Gln Leu Arg Gln Glu Pro Val Ile
65 70 75 80

Lys Gly Leu Ala Lys Leu Asn Leu Gln Ser Asn Val Ser Val Leu Gln
85 90 95

Asp Ala Leu Val Glu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Leu
100 105 110

Met Asn Arg Ala Ser Gln Tyr Ala Asp Ala Ala Ser Ile Gln Ser Leu
115 120 125

Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp
130 135 140

Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp
145 150 155 160

Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr
165 170 175

Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Lys Gly Gly Lys Asp Ile
180 185 190

Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp
195 200 205

His Leu Ser Gln Ile Asp Ser Ala Leu Ser Ala Gln Ala Arg Thr Leu
210 215 220

Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gln Ala Val Ser
225 230 235 240

Glu Gln Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gln Asp
245 250 255

Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala
260 265 270

Val Gly Phe Ser Gly Gln Pro Met Val Tyr Leu Ser Gln Tyr Thr Ser
275 280 285

Gly Asn Gly Ile Val Gly Ala Gln Leu Ile Ala Gly Asn Pro Asp Gln
290 295 300

Ala Ala Ala Ala Ile Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala
305 310 315 320

Lys Gln Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp
325 330 335

Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn
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340 345 



350 



Pro Asp Lys Asp Gly Ile Phe Ala Gln Val Ala Asn Lys Ser Gly Ser
355 360 365

Thr Gln Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser
370 375 380

Glu Asn Lys Asp Gln Tyr Tyr Leu Lys Thr Glu Gln Gly Tyr Ile Thr
385 390 395 400

Val Asp Ser Ser Gly Gln Ser Asn Trp Lys Asn Ala Leu Val Ile Asn
405 410 415

Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser
420 425 430

Gly Thr Pro Thr Asn Pro Asp Asp Val Ile Pro Pro Ala Ile Asn Asp
435 440 445

Ile Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser
450 455 460

Tyr Gln Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp Ile Thr Asn
465 470 475 480

His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr
485 490 495

Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr Phe Cys Glu Lys
500 505 510

Thr Arg Leu Ser Phe Asn Gln Leu Met Asp Leu Thr Ala Gln Gln Ser
515 520 525

Tyr Ser Gln Ser Ser Ile Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg
530 535 540

Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr
545 550 555 560

Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gln Tyr Leu Trp Ile
565 570 575

Gln Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala
580 585 590

Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gln Thr Gly
595 600 605

Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn Ala Ser Arg Ser
610 615 620

Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys Pro Val Leu Glu
625 630 635 640

Ala Leu Ala Glu Tyr Val Ser Leu Lys Gln Arg Tyr Gly Leu Asp Ala
645 650 655

Asn Thr Phe Ala Thr Phe Ile Ser Ala Val Asn Pro Tyr Thr Pro Asp
660 665 670

Gln Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn
675 680 685

His Val Ile Ala Leu Gly Thr Glu Val Lys Tyr Ala Glu Asn Glu Gln
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650 655 700 

Asp Glu Leu Ala Ala Ile Cys Cys Lys Ala Leu Gly Val Thr Ser Asp
705 710 715 720

Glu Leu Leu Arg Ile Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe
725 730 735

Thr Leu Asp Glu Tyr Thr Ala Ser Gln Leu Tyr Arg Phe Gly Ala Ile
740 745 750

Pro Arg Leu Phe Gly Leu Thr Phe Ala Gln Ala Glu Ile Leu Trp Arg
755 760 765

Leu Met Glu Gly Gly Lys Asp Ile Leu Leu Gln Gln Leu Gly Gln Ala
770 775 780

Lys Ser Leu Gln Pro Leu Ala Ile Leu Arg Arg Thr Glu Gln Val Leu
785 790 795 800

Asp Trp Met Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gln Gly Met
805 810 815

Val Ser Thr Gln Trp Ser Gly Thr Ala Thr Ala Glu Met Phe Asn Phe
820 825 830

Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gln Ala Ala Thr Lys Glu
835 840 845

Thr Met Asp Ser Ala Leu Gln Gln Lys Val Leu Arg Ala Leu Ser Ala
850 855 860

Gly Phe Gly Ile Lys Ser Asn Val Met Gly Ile Val Thr Phe Trp Leu
865 870 875 880

Glu Lys Ile Thr Ile Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr
885 890 895

Trp His Asp Ile Gln Thr Leu Phe Ser His Asp Asn Ala Thr Leu Glu
900 905 910

Ser Leu Gln Thr Asp Thr Ser Leu Val Ile Ala Thr Gln Gln Leu Ser
915 920 925

Gln Leu Val Leu Ile Val Lys Trp Leu Ser Leu Thr Glu Gln Asp Leu
930 935 940

Gln Leu Leu Thr Thr Tyr Pro Glu Arg Leu Ile Asn Gly Ile Thr Asn
945 950 955 960

Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys
965 970 975

Gln Trp Glu Thr Gln Val Thr Val Ser Arg Asp Glu Ala Met Arg Cys
980 985 990

Phe Asp Gln Leu Asn Ala Asn Asp Met Thr Thr Glu Asn Ala Gly Ser
995 1000 1005

Leu Ile Ala Thr Leu Tyr Glu Met Asp Lys Gly Thr Gly Ala Gln Val
1010 1015 1020

Asn Thr Leu Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser
1025 1030 1035 1040

Leu Trp Gln Leu Leu Thr Trp Leu Arg Val Gly Gln Arg Leu Asn Val
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°^ 1050 



105 = 

Gly Ser Thr Thr Leu Gly Asn Leu Leu Ser Met Met Gln Ala Asp Pro
1060 1065 1070

Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gln Asn Leu Ser
1075 1080 1085

Ala Ala Ile Ser Asn Arg Gln ...
1090 1095



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 603 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:



Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn Ser Ile Asp Thr

Phe Cys oi« Lys Thr Arg Leu Ser Phe Asn Gin Leu M ec Asp Leu Thr 

^ 30 

AU ,1„ cl„ ser Tyr s„ 01n s.r s.r „. Asp Au Lys M . AU Jer 

40 45 



Arg Tyr Val Arg Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr
50 55 60



40 



45 



Gly Ala Ala Tyr Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly oin 
Tyr Leu Trp He G j„ Tnr Asp Gly Lys g ^ ^ ^ ^ ^ ^ 

Thr Val Val Ma Leu Ala Gly Arg Ala Glu Lys Leu Val Arg 2 ser 

105 110 

Ser Gln Thr Gly Leu Ser Phe Glu Glu Leu Asp Trp Leu Ile Ala Asn
115 120 125

Ala Ser Arg Ser Val Pro Asp His His Asp Lys Ile Val Leu Asp Lys
130 135 140



55 



60 



Pro val L eu Glu Ala Leu Ala Glu Tyr Val Ser Leu Lys Gin Arg Tyr 

155 160 
Gly Leu Asp Ala Asn Thr Phe Ala Thr Phe lie Ser Ala Val Asn Pro 

Tyr Thr Pro Asp Gin Thr Pro ser Phe Tyr Glu Thr Ala Phe Arg Ser 

185 190 

AU Asp «y AS„ «i. v.l ... Jlj L u Gly T„r 0 ,„ v.l Ly, Ty, Al. 

200 205 

Clu Asn Glu Gin Asp Glu Leu Ala Ala lie Cys cys Lys Ala Leu Gly 



<S5 215 220 
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Val Thr J« r , iE Jlu L Liu lc3 ne ciy ^ ^ ^ ^ 

2j5 240 

5 Ala cly Arg Phe Thr Leu , sp clu Tyr Thr AU s . r Gln Leu ^ Arg 

i.50 255 

Ph. Cly AU ne Pro at. Leu Phe cly Leu Thr Phe A la oln Au clu 

2o5 270 

10 lie L.u Trp Arg L6U Met M , „ ly Lys ^ n# ^ ^ o 



XXX Cly cln Aia Lys ser flln au ue ^ u ^ 

15 300 
Glu Gin Val Leu Asp Trp Met Ser Pm v^i s ^ , 

305 3 1 q * er Pro Val Asn Leu ser Leu Thr Tyr 

315 320 

2Q Leu cln cly MeC v.l Ser Thr Cln Trp Ser Cly Thr Ala Thr Ala clu 

330 J35 

*« ^ A S „ Phe L*u clu A S „ v.l Cys Asp ... V al As „ ser „„ AU 

i45 350 
25 xxx Thr Lys 01u Thr Met Asp AJi Leu Qjn w l ^ 

360 365 



Ala Leu Ser Ala Cly Ph e cly Iie Lys Ser Asn Val Mec Cly IU V al 
30 375 380 

Thr Phe Trp Leu clu Lys He Thr He riv » M . 

385 inr Iie Ar 9 Asp Asn Pro Phe Thr 

395 400 

3J Leu AU Asn Tyr Trp „ ls As p n . 01n h ^ ph6 ^ ^ ^ 

410 415 

Ala Thr Leu clu Ser Leu Cln Thr Asp Thr Ser Leu Val Ue Ala Thr 

425 430 
40 „l„ Oln Leu ser oln Leu Val Leu He v.l Ly S Trp v.l Ser L.u Thr 

440 445 

Clu Cln Asp Leu Cln Leu Leu Thr Thr Tyr Pro Clu Ar g Leu lie Asn 
45 455 460 

JlV He Thr Asn Val Pro Val Pro Asn Pro Clu Leu Leu Leu Thr Leu 

475 480 
50 ser Arg Phe Lys cln Trp Clu Thr cln Val Thr v.l Ser Arg Asp Clu 

490 4g5 
Ala hoc Ar g cys Phe Asp oln Leu Ul A£ „ Asp ^ 

505 510 
55 Asn Ala Cly ser Leu lie Ala Thr Leu Tyr Clu Mec Asp Lys cly Thr 

520 525 

Oly Ala Cln val Asn Thr Leu Leu Leu Cly clu Asn Asn Trp Pro Lys 
n<) 535 540 Y 



65 



ser Phe Thr Ser Leu Trp cln Leu Leu Thr Trp Leu Arg Val Cly cln 
Arg Leu Asn Val cly Ser Thr Thr Leu cly Asn Leu Leu Ser Me C 



570 5?5 
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Gin Ala Asp Pro Aid Ala Glu 3er Ser Ala L«u Leu Ala Ser Vai Aia 
530 535 590 

Gln Asn Leu Ser Ala Ala Ile Ser Asn Arg Gln *
595 600

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2557 base pairs

(B) TYPE: nucleic acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 





GAATTCGGCT TGCGTTTAAT ATTGATGATG TCTCGCTCTT CCGCCTGCTT AAAATTACCG 60
ACCATGATAA TAAAGATGGA AAAATTAAAA ATAACCTAAA GAATCTTTCC AATTTATATA 120
TTGGAAAATT ACTGGCAGAT ATTCATCAAT TAACCATTGA TGAACTGGAT TTATTACTGA 180
TTGCCGTAGG TGAAGGAAAA ACTAATTTAT CCGCTATCAG TGATAAGCAA TTGGCTACCC 240
TGATCAGAAA ACTCAATACT ATTACCAGCT GGCTACATAC ACAGAAGTGG AGTGTATTCC 300
AGCTATTTAT CATGACCTCC ACCAGCTATA ACAAAACGCT AACGCCTGAA ATTAAGAATT 360
TGCTGGATAC CGTCTACCAC GGTTTACAAG GTTTTGATAA AGACAAAGCA GATTTGCTAC 420
ATGTCATGGC GCCCTATATT GCGGCCACCT TGCAATTATC ATCGGAAAAT GTCGCCCACT 480
CGGTACTCCT TTGGGCAGAT AAGTTACAGC CCGGCGACGG CGCAATGACA GCAGAGGGAN 540
TCTGGGACTG GTTGAATACT AAGTATACGC CGGGTTCATC GGAAGCCGTA GAAACGCAGG 600
AACATATCGT TCAGTATTGT CAGGCTCTGG CACAATTGGA AATGGTTTAC CATTCCACCG 660
GCATCAACGA AAACGCCTTC CGTCTATTTG TGACAAAACC AGAGATGTTT GGCGCTGCAA 720
CTGGAGCAGC GCCCGCGCAT GATGCCCTTT CACTGATTAT GCTGACACGT TTTGCGGATT 780
GGGTGAACGC ACTAGGCGAA AAAGCGTCCT CGGTGCTAGC GGCATTTGAA GCTAACTCGT 840
TAACGGCAGA ACAACTGGCT GATGCCATGA ATCTTGATGC TAATTTGCTG TTGCAAGCCA 900
GTATTCAAGC ACAAAATCAT CAACATCTTC CCCCAGTAAC TCCAGAAAAT GCGTTCTCCT 960
GTTGGACATC TATCAATACT ATCCTGCAAT GGGTTAATGT CGCACAACAA TTGAAATGTC 1020
GCCCCACAGG GCGTTTCCGC TTTGGTCGGG CTGGATTATA TTCAATCAAT GAAAGAGACA 1080
CCGACCTATG CCCAGTGGGA AAACGCGGCA GGCGTATTAA CCGCCGGGTT GAATTCAACA 1140
ACAGGCTAAT ACATTACAAC GCTTTTCTGG ATGAATCTCG CAGTGCCGCA TTAAGCACCT 1200
ACTATATCCG TCAAGTCGCC AAGGCAGCGG CGGCTATTAA AAGCCGTGAT GACTTGTATC 1260
AATACTTACT GATTGATAAT CAGGTTTCTG CGGCAATAAA AACCACCCGG ATCGCCGAAG 1320
CCATTGCCAG TATTCAACTG TACGTCAACC GGGCATTGGA AAATGTGGAA GAAAATGCCA 1380
ATTCGGGGGT TATCAGCCGC CAATTCTTTA TCGACTGGGA CAAATACAAT AAACGCTACA 1440
GCACTTGGGC GGGTGTTTCT CAATTAGTTT ACTACCCGGA AAACTATATT GATCCGACCA 1500
TGCGTATCGG ACAAACCAAA ATGATGGACG CATTACTGCA ATCCGTCAGC CAAAGCCAAT 1560
TAAACGCCGA TACCGTCGAA GATGCCTTTA TGTCTTATCT GACATCGTTT GAACAAGTGG 1620
CTAATCTTAA AGTTATTAGC GCATATCACG ATAATATTAA TAACGATCAA GGGCTGACCT 1680
ATTTTATCGG ACTCAGTGAA ACTGATGCCG GTGAATATTA TTGGCGCAGT GTCGATCACA 1740
GTAAATTCAA CGACGGTAAA TTCGCGGCTA ATGCCTGGAG TGAATGGCAT AAAATTGATT 1800
GTCCAATTAA CCCTTATAAA AGCACTATCC GTCCAGTGAT ATATAAATCC CGCCTGTATC 1860
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10 



TCCTCTGGTT GGAACAAAAG GAGATCACCA AACAGACAGG AAATAGTAAA GATGGCTATC 1920
AAACTGAAAC GGATTATCGT TATGAACTAA AATTGGCGCA TATCCGCTAT GATGGCACTT 1980
GCAATACGCC AATCACCTTT GATGTCAATA AAAAAATATC CGAGCTAAAA CTGGAAAAAA 2040
ATAGAGCGCC CGGACTCTAT TGTGCCGGTT ATCAAGGTGA AGATACGTTG CTGGTGATGT 2100
TTTATAACCA ACAAGACACA CTAGATAGTT ATAAAAACGC TTCAATGCAA GGACTATATA 2160
TCTTTGCTGA TATGGCATCC AAAGATATGA CCCCAGAACA GAGCAATGTT TATCGGGATA 2220
ATAGCTATCA ACAATTTGAT ACCAATAATG TCAGAAGAGT GAATAACCGC TATGCAGAGG 2280
ATTATGAGAT TCCTTCTTCG GTAAGTAGCC GTAAAGACTA TGGTTGGGGA GATTATTACC 2340
TCAGCATGGT ATATAACGGA GATATTCCAA CTATCAATTA CAAAGCCGCA TCAAGTGATT 2400
TAAAAATTTA TATTTCACCA AAATTAAGAA TTATTCATAA TGGATATGAA GGACAGAAGC 2460
GCAATCAATG CAATTTGATG AATAAATATG GCAAACTAGG TGATAAATTT ATTGTGTATA 2520
CCAGCCTGGG CGTTAATCCG AATAATAAGC CGAATTC 2557



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 845 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein (partial)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile Thr
1 5 10 15

Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn Leu
20 25 30

Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu Thr
35 40 45

Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys Thr
50 55 60

Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg Lys
65 70 75 80

Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val Phe
85 90 95

Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr Pro
100 105 110

Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly Phe
115 120 125

Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile Ala
130 135 140

Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu Leu
145 150 155 160

Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu Gly
165 170 175

Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala
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135 



ISO 



Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala Gln
195 200 205

Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe Arg
210 215 220

Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala Ala
225 230 235 240

Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala Asp
245 250 255

Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala Phe
260 265 270

Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn Leu
275 280 285

Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His Gln
290 295 300

His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser
305 310 315 320

Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Lys Cys
325 330 335

Arg Pro Thr Gly Arg Phe Arg Phe Gly Arg Ala Gly Leu Tyr Ser Ile
340 345 350

Asn Glu Arg Asp Thr Asp Leu Cys Pro Val Gly Lys Arg Gly Arg Arg
355 360 365

Ile Asn Arg Arg Val Glu Phe Asn Asn Arg Leu Ile His Tyr Asn Ala
370 375 380

Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg
385 390 395 400

Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr
405 410 415

Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr
420 425 430

Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala
435 440 445

Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln
450 455 460

Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala
465 470 475 480

Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr
485 490 495

Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val
500 505 510

Ser Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser
515 520 525

Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala
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530 535 540 

Tyr His Asp Asn lie Asn Asn Asp Gin Gly Leu Thr Tyr Phe lie Gly 
545 550 555 560 

5 

Lau Ser Glu Thr Asp Ala Gly Glu Tyr T/r Trp Arg Ser Val Asp His 
565 570 575 

Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 
10 580 585 590 

His Lys lie Asp Cys Pro lie Asn Pro Tyr Lys Ser Thr He Arg Pro 
595 600 605 

15 Val He Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu 
610 615 620 



Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr
625 630 635 640

Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
645 650 655

Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
660 665 670

Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
675 680 685

Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
690 695 700



Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
705 710 715 720

Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
725 730 735

Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
740 745 750

Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys
755 760 765

Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
770 775 780

Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
785 790 795 800

Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
805 810 815

Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
820 825 830

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn
835 840 845



(2) INFORMATION FOR SEQ ID NO: 38: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly
1 5 10 15

Lys

(2) INFORMATION FOR SEQ ID NO: 39: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala
1 5 10 15

Ile Ser Pro Ala Lys
20



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Ala Asn Ser Leu Tyr Ala Leu Phe Leu Pro Gln
1 5 10



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 
(B) TYPE: amino acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr
1 5 10 15

Ala Gly Leu Glu 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULAR TYPE: protein 

(v) FRAGMENT TYPE: N-terminal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 44:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Asp Asp Ser Gly Asp Asp Asp Lys Val Thr Asn Thr Asp Ile His
1 5 10 15

Arg 

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULAR TYPE: protein

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Asp Val Xaa Gly Ser Glu Lys Ala Asn Glu Lys Leu Lys
1 5 10



(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7551 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46 (tcdA):

ATG AAC GAG TCT GTA AAA GAG ATA CCT GAT GTA TTA AAA AGC CAG TGT 48 
Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15

GGT TTT AAT TGT CTG ACA GAT ATT AGC CAC AGC TCT TTT AAT GAA TTT 96
Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30

CGC CAG CAA GTA TCT GAG CAC CTC TCC TGG TCC GAA ACA CAC GAC TTA 144
Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45

TAT CAT GAT GCA CAA CAG GCA CAA AAG GAT AAT CGC CTG TAT GAA GCG 192
Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
50 55 60

CGT ATT CTC AAA CGC GCC AAT CCC CAA TTA CAA AAT GCG GTG CAT CTT 240
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80



GCC ATT CTC CCT CCC AAT GCT CAA CTG ATA Gee TAT AAC AAT CAA TTT 288
Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95

AGC GGT AGA GCC ACT CAA TAT CTT GCG CCC GGT ACC GTT TCT TCC ATG 336
Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110

TTC TCC CCC GCC GCT TAT TTG ACT GAA CTT TAT CGT GAA GCA CGC AAT 384
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125

TTA CAC GCA ACT GAC TCC GTT TAT TAT CTG GAT ACC CGC CGC CCA GAT 432
Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp
130 135 140

CTC AAA TCA ATG GCG CTC AGT CAG CAA AAT ATG GAT ATA GAA TTA TCC 480
Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu Leu Ser
145 150 155 160

ACA CTC TCT TTG TCC AAT GAG CTG TTA TTG GAA AGC ATT AAA ACT GAA 528
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu
165 170 175



TCT AAA CTG GAA AAC TAT ACT AAA GTG ATG GAA ATG CTC TCC ACT TTC 576
Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe
180 185 190

CGT CCT TCC GGC GCA ACG CCT TAT CAT GAT GCT TAT GAA AAT GTG CGT 624
Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg
195 200 205

GAA GTT ATC CAG CTA CAA GAT CCT GGA CTT GAG CAA CTC AAT GCA TCA 672
Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser
210 215 220

CCG GCA ATT GCC GCG TTG ATG CAT CAA GCC TCC CTA TTG GGT ATT AAC 720
Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn
225 230 235 240

GCT TCA ATC TCG CCT GAG CTA TTT AAT ATT CTG ACG GAG GAG ATT ACC 768
Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr
245 250 255

GAA GGT AAT GCT GAG GAA CTT TAT AAG AAA AAT TTT GGT AAT ATC GAA 816
Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu
260 265 270

CCG GCC TCA TTG GCT ATG CCG GAA TAC CTT AAA CGT TAT TAT AAT TTA 864
Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu
275 280 285

AGC GAT GAA GAA CTT AGT CAG TTT ATT GGT AAA GCC AGC AAT TTT GGT 912
Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly
290 295 300

CAA CAG GAA TAT AGT AAT AAC CAA CTT ATT ACT CCG GTA GTC AAC AGC 960
Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser
305 310 315 320

AGT GAT GGC ACG GTT AAG GTA TAT CGG ATC ACC CGC GAA TAT ACA ACC 1008
Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335

AAT GCT TAT CAA ATG GAT GTG GAG CTA TTT CCC TTC GGT GGT GAG AAT 1056
Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
340 345 350

TAT CGG TTA GAT TAT AAA TTC AAA AAT TTT TAT AAT GCC TCT TAT TTA 1104
Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365


TCC ATC AAG TTA AAT GAT AAA AGA GAA CTT CTT CGA ACT GAA GGC GCT 1152
Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala
370 375 380

CCT CAA GTC AAT ATA GAA TAC TCC GCA AAT ATC ACA TTA AAT ACC GCT 1200
Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala
385 390 395 400

GAT ATC ACT CAA CCT TTT GAA ATT GGC CTG ACA CGA GTA CTT CCT TCC 1248
Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser
405 410 415

GGT TCT TGG GCA TAT GCC GCC GCA AAA TTT ACC GTT GAA GAG TAT AAC 1296
Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430

CAA TAC TCT TTT CTG CTA AAA CTT AAC AAG GCT ATT CGT CTA TCA CGT 1344
Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445



GCG ACA GAA TTG TCA CCC ACG ATT CTG CAA GGC ATT CTG CGC ACT GTT 1392
Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val
450 455 460

AAT CTA CAA CTG GAT ATC AAC ACA GAC GTA TTA GGT AAA GTT TTT CTG 1440
Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu
465 470 475 480

ACT AAA TAT TAT ATC CAG CGT TAT GCT ATT CAT GCT GAA ACT GCC CTG 1488
Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu
485 490 495

ATA CTA TCC AAC GCG CCT ATT TCA CAA CGT TCA TAT GAT AAT CAA CCT 1536
Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510

AGC CAA TTT GAT CGC CTG TTT AAT ACG CCA TTA CTG AAC GGA CAA TAT 1584
Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr
515 520 525

TTT TCT ACC GGC GAT GAG GAG ATT GAT TTA AAT TCA GGT AGC ACC GGC 1632
Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly
530 535 540



GAT TGG CGA AAA ACC ATA CTT AAG CGT GCA TTT AAT ATT GAT GAT GTC 1680
Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val
545 550 555 560

TCG CTC TTC CGC CTG CTT AAA ATT ACC GAC CAT GAT AAT AAA GAT GGA 1728
Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys Asp Gly
565 570 575

AAA ATT AAA AAT AAC CTA AAG AAT CTT TCC AAT TTA TAT ATT GGA AAA 1776
Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys
580 585 590

TTA CTG GCA GAT ATT CAT CAA TTA ACC ATT GAT GAA CTG GAT TTA TTA 1824
Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu
595 600 605

CTG ATT GCC GTA GGT GAA GGA AAA ACT AAT TTA TCC GCT ATC AGT GAT 1872
Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp
610 615 620



AAG CAA TTG GCT ACC CTG ATC AGA AAA CTC AAT ACT ATT ACC AGC TGG 1920
Lys Gln Leu Ala Thr Leu Ile Arg Lys Leu Asn Thr Ile Thr Ser Trp
625 630 635 640

CTA CAT ACA CAG AAG TGG AGT GTA TTC CAG CTA TTT ATC ATG ACC TCC 1968
Leu His Thr Gln Lys Trp Ser Val Phe Gln Leu Phe Ile Met Thr Ser
645 650 655

ACC ACC TAT AAC AAA ACG CTA ACG CCT GAA ATT AAG AAT TTG CTG GAT 2016
Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu Leu Asp
660 665 670

ACC GTC TAC CAC GGT TTA CAA GGT TTT GAT AAA GAC AAA GCA GAT TTG 2064
Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu
675 680 685

CTA CAT GTC ATG GCG CCC TAT ATT GCG GCC ACC TTG CAA TTA TCA TCG 2112
Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser
690 695 700



GAA AAT GTC GCC CAC TCG GTA CTC CTT TGG GCA GAT AAG TTA CAG CCC 2160
Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gln Pro
705 710 715 720

GGC GAC GGC GCA ATG ACA GCA GAA AAA TTC TGG GAC TGG TTG AAT ACT 2208
Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr
725 730 735

AAG TAT ACG CCG GGT TCA TCG GAA GCC GTA GAA ACG CAG GAA CAT ATC 2256
Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile
740 745 750

GTT CAG TAT TGT CAG GCT CTG GCA CAA TTG GAA ATG CTT TAC CAT TCC 2304
Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr His Ser
755 760 765



ACC GGC ATC AAC GAA AAC GCC TTC CGT CTA TTT GTG ACA AAA CCA GAG 2352
Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu
770 775 780

ATG TTT GGC GCT GCA ACT GGA GCA GCG CCC GCG CAT GAT GCC CTT TCA 2400
Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser
785 790 795 800

CTG ATT ATG CTG ACA CGT TTT GCG GAT TGG GTG AAC GCA CTA GGC GAA 2448
Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu
805 810 815

AAA GCG TCC TCG GTG CTA GCG GCA TTT GAA GCT AAC TCG TTA ACG GCA 2496
Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala
820 825 830

GAA CAA CTG GCT GAT GCC ATG AAT CTT GAT GCT AAT TTG CTG TTG CAA 2544
Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln
835 840 845



GCC AGT ATT CAA GCA CAA AAT CAT CAA CAT CTT CCC CCA GTA ACT CCA 2592
Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro
850 855 860

GAA AAT GCG TTC TCC TGT TGG ACA TCT ATC AAT ACT ATC CTG CAA TGG 2640
Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp
865 870 875 880

GTT AAT GTC GCA CAA CAA TTG AAT GTC GCC CCA CAG GGC GTT TCC GCT 2688
Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala
885 890 895

TTG GTC GGG CTG GAT TAT ATT CAA TCA ATG AAA GAG ACA CCG ACC TAT 2736
Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr
900 905 910

GCC CAG TGG GAA AAC GCG GCA GGC GTA TTA ACC GCC GGG TTG AAT TCA 2784
Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser
915 920 925



CAA CAG GCT AAT ACA TTA CAC GCT TTT CTG GAT GAA TCT CGC AGT GCC 2832
Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala
930 935 940

GCA TTA AGC ACC TAC TAT ATC CGT CAA GTC GCC AAG GCA GCG GCG GCT 2880
Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala
945 950 955 960

ATT AAA AGC CGT GAT GAC TTG TAT CAA TAC TTA CTG ATT GAT AAT CAG 2928
Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln
965 970 975

GTT TCT GCG GCA ATA AAA ACC ACC CGG ATC GCC GAA GCC ATT GCC ACT 2976
Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser
980 985 990

ATT CAA CTG TAC GTC AAC CGG GCA TTG GAA AAT GTG GAA GAA AAT GCC 3024
Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala
995 1000 1005

AAT TCG GCG GTT ATC AGC CGC CAA TTC TTT ATC GAC TGG GAC AAA TAC 3072
Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp Lys Tyr
1010 1015 1020

AAT AAA CGC TAC AGC ACT TGG GCG GGT GTT TCT CAA TTA GTT TAC TAC 3120
Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr
1025 1030 1035 1040

CCG GAA AAC TAT ATT GAT CCG ACC ATG CGT ATC GGA CAA ACC AAA ATC 3168
Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met
1045 1050 1055

ATG GAC GCA TTA CTG CAA TCC GTC AGC CAA AGC CAA TTA AAC GCC GAT 3216
Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp
1060 1065 1070



ACC GTC GAA GAT GCC TTT ATG TCT TAT CTG ACA TCG TTT GAA CAA GTG 3264
Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val
1075 1080 1085

GCT AAT CTT AAA GTT ATT AGC GCA TAT CAC GAT AAT ATT AAT AAC GAT 3312
Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp
1090 1095 1100

CAA GGG CTG ACC TAT TTT ATC GGA CTC AGT GAA ACT GAT GCC GGT GAA 3360
Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu
1105 1110 1115 1120

TAT TAT TGG CGC AGT GTC GAT CAC AGT AAA TTC AAC GAC GGT AAA TTC 3408
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe
1125 1130 1135

GCG GCT AAT GCC TGG AGT GAA TGG CAT AAA ATT GAT TGT CCA ATT AAC 3456
Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro Ile Asn
1140 1145 1150



CCT TAT AAA AGC ACT ATC CGT CCA GTG ATA TAT AAA TCC CGC CTG TAT 3504
Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr
1155 1160 1165

CTG CTC TGG TTG GAA CAA AAG GAG ATC ACC AAA CAG ACA GGA AAT AGT 3552
Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser
1170 1175 1180

AAA GAT GGC TAT CAA ACT GAA ACG GAT TAT CGT TAT GAA CTA AAA TTG 3600
Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu
1185 1190 1195 1200

GCG CAT ATC CGC TAT GAT GGC ACT TGG AAT ACG CCA ATC ACC TTT GAT 3648
Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp
1205 1210 1215

GTC AAT AAA AAA ATA TCC GAG CTA AAA CTG GAA AAA AAT AGA GCG CCC 3696
Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro
1220 1225 1230

GGA CTC TAT TGT GCC GGT TAT CAA GGT GAA GAT ACG TTG CTG GTC ATG 3744
Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met
1235 1240 1245

TTT TAT AAC CAA CAA GAC ACA CTA GAT AGT TAT AAA AAC GCT TCA ATG 3792
Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met
1250 1255 1260

CAA GGA CTA TAT ATC TTT GCT GAT ATG GCA TCC AAA GAT ATG ACC CCA 3840
Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro
1265 1270 1275 1280

GAA CAG AGC AAT GTT TAT CGG GAT AAT AGC TAT CAA CAA TTT GAT ACC 3888
Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln Phe Asp Thr
1285 1290 1295

AAT AAT GTC AGA AGA GTG AAT AAC CGC TAT GCA GAG GAT TAT GAG ATT 3936
Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile
1300 1305 1310

CCT TCC TCG GTA AGT AGC CGT AAA GAC TAT GGT TGG GGA GAT TAT TAC 3984
Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr
1315 1320 1325

CTC AGC ATG GTA TAT AAC GGA GAT ATT CCA ACT ATC AAT TAC AAA GCC 4032
Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala
1330 1335 1340

GCA TCA AGT GAT TTA AAA ATC TAT ATC TCA CCA AAA TTA AGA ATT ATT 4080
Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile
1345 1350 1355 1360

CAT AAT GGA TAT GAA GGA CAG AAG CGC AAT CAA TGC AAT CTG ATG AAT 4128
His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu Met Asn
1365 1370 1375



AAA TAT GGC AAA CTA GGT GAT AAA TTT ATT GTT TAT ACT AGC TTG GGG 4176
Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly
1380 1385 1390

GTC AAT CCA AAT AAC TCG TCA AAT AAG CTC ATG TTT TAC CCC GTC TAT 4224
Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr
1395 1400 1405

CAA TAT AGC GGA AAC ACC AGT GGA CTC AAT CAA GGG AGA CTA CTA TTC 4272
Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gln Gly Arg Leu Leu Phe
1410 1415 1420

CAC CGT GAC ACC ACT TAT CCA TCT AAA GTA GAA GCT TGG ATT CCT GGA 4320
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly
1425 1430 1435 1440

GCA AAA CGT TCT CTA ACC AAC CAA AAT GCC GCC ATT GGT GAT GAT TAT 4368
Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr
1445 1450 1455



GCT ACA GAC TCT CTG AAT AAA CCG GAT GAT CTT AAG CAA TAT ATC TTT 4416
Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe
1460 1465 1470

ATG ACT GAC AGT AAA GGG ACT GCT ACT GAT GTC TCA GGC CCA GTA GAG 4464
Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu
1475 1480 1485

ATT AAT ACT GCA ATT TCT CCA GCA AAA GTT CAG ATA ATA GTC AAA GCG 4512
Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys Ala
1490 1495 1500

GGT GGC AAG GAG CAA ACT TTT ACC GCA GAT AAA GAT GTC TCC ATT CAG 4560
Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gln
1505 1510 1515 1520

CCA TCA CCT AGC TTT GAT GAA ATG AAT TAT CAA TTT AAT GCC CTT GAA 4608
Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu
1525 1530 1535

ATA GAC GGT TCT GGT CTG AAT TTT ATT AAC AAC TCA GCC AGT ATT GAT 4656
Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp
1540 1545 1550

CTT ACT TTT ACC GCA TTT GCG GAG GAT GGC CGC AAA CTG GGT TAT GAA 4704
Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu
1555 1560 1565

AGT TTC AGT ATT CCT GTT ACC CTC AAG GTA AGT ACC GAT AAT GCC CTG 4752
Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu
1570 1575 1580

ACC CTG CAC CAT AAT GAA AAT GGT GCG CAA TAT ATG CAA TGG CAA TCC 4800
Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp Gln Ser
1585 1590 1595 1600

TAT CGT ACC CGC CTG AAT ACT CTA TTT GCC CGC CAG TTG GTT GCA CGC 4848
Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg
1605 1610 1615

GCC ACC ACC GGA ATC GAT ACA ATT CTG AGT ATG GAA ACT CAG AAT ATT 4896
Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile
1620 1625 1630

CAG GAA CCG CAG TTA GGC AAA GGT TTC TAT GCT ACG TTC GTC ATA CCT 4944
Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro
1635 1640 1645

CCC TAT AAC CTA TCA ACT CAT GGT GAT GAA CGT TGG TTT AAG CTT TAT 4992
Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr
1650 1655 1660

ATC AAA CAT GTT GTT GAT AAT AAT TCA CAT ATT ATC TAT TCA GGC CAG 5040
Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln
1665 1670 1675 1680



CTA ACA GAT ACA AAT ATA AAC ATC ACA TTA TTT ATT CCT CTT GAT GAT 5088
Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp
1685 1690 1695

GTC CCA TTG AAT CAA GAT TAT CAC GCC AAG GTT TAT ATG ACC TTC AAG 5136
Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys
1700 1705 1710

AAA TCA CCA TCA GAT GGT ACC TGG TGG GGC CCT CAC TTT GTT AGA GAT 5184
Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp
1715 1720 1725

GAT AAA GGA ATA GTA ACA ATA AAC CCT AAA TCC ATT TTG ACC CAT TTT 5232
Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe
1730 1735 1740

GAG AGC GTC AAT GTC CTG AAT AAT ATT AGT AGC GAA CCA ATG GAT TTC 5280
Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe
1745 1750 1755 1760



AGC CGC GCT AAC AGC CTC TAT TTC TGG GAA CTG TTC TAC TAT ACC CCG 5328
Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro
1765 1770 1775

ATG CTG GTT GCT CAA CGT TTG CTG CAT GAA CAG AAC TTC GAT GAA GCC 5376
Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala
1780 1785 1790

AAC CGT TGG CTG AAA TAT GTC TGG ACT CCA TCC GGT TAT ATT GTC CAC 5424
Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His
1795 1800 1805

GCC CAG ATT CAG AAC TAC CAG TGG AAC GTC CGC CCG TTA CTG GAA GAC 5472
Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp
1810 1815 1820

ACC AGT TGG AAC ACT GAT CCT TTG GAT TCC GTC GAT CCT GAC GCG GTA 5520
Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val
1825 1830 1835 1840

GCA CAG CAC GAT CCA ATG CAC TAC AAA GTT TCA ACT TTT ATG CGT ACC 5568
Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr
1845 1850 1855

TTG GAT CTA TTG ATA GCA CGC GGC GAC CAT GCT TAT CGC CAA CTG GAA 5616
Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu
1860 1865 1870

CGA GAT ACA CTC AAC GAA GCG AAG ATG TGG TAT ATG CAA GCG CTG CAT 5664
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His
1875 1880 1885

CTA TTA GGT GAC AAA CCT TAT CTA CCG CTG AGT ACG ACA TGG AGT GAT 5712
Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp
1890 1895 1900

CCA CGA CTA GAC AGA GCC GCG GAT ATC ACT ACC CAA AAT GCT CAC GAC 5760
Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp
1905 1910 1915 1920

AGC GCA ATA GTC GCT CTG CGG CAG AAT ATA CCT ACA CCG GCA CCT TTA 5808
Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu
1925 1930 1935

TCA TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC 5856
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile
1940 1945 1950

AAT GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC 5904
Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr
1955 1960 1965

AAT CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA 5952
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro
1970 1975 1980



ATC TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT 6000
Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val
1985 1990 1995 2000

GCC ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG 6048
Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu
2005 2010 2015

TGG CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG 6096
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln
2020 2025 2030

CTC ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC 6144
Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp
2035 2040 2045

CCG GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA 6192
Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile
2050 2055 2060



TTG ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC 6240
Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala
2065 2070 2075 2080

GAG AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA CAA TCG CGC TTT 6288
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe
2085 2090 2095

GAT AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC 6336
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn
2100 2105 2110

CAA GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT 6384
Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val
2115 2120 2125

CAG GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC 6432
Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile
2130 2135 2140



TTC GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG 6480
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala
2145 2150 2155 2160

ACA GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG 6528
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala
2165 2170 2175

GAT AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG 6576
Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp
2180 2185 2190

GAG ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT 6624
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala
2195 2200 2205

CAG CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA 6672
Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys
2210 2215 2220



ACC AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC 6720
Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe
2225 2230 2235 2240

CTG CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT 6768
Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly
2245 2250 2255

CGA CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT 6816
Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg
2260 2265 2270

TGC CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT 6864
Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser
2275 2280 2285



GCC CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG 6912
Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2290 2295 2300

CTT GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT 6960
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala
2305 2310 2315 2320

CAT CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG 7008
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser
2325 2330 2335

CTG GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC 7056
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser
2340 2345 2350



CTG GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA GGC ACT GCC 7104
Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala
2355 2360 2365

GGC AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA 7152
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys
2370 2375 2380

ACC TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA 7200
Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu
2385 2390 2395 2400

GAT TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC 7248
Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser
2405 2410 2415

GTC ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA 7296
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile
2420 2425 2430

TTG TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG 7344
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu
2435 2440 2445

GCA GTT TCT CAC GGT ATC AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC 7392
Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe
2450 2455 2460

AAC GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC 7440
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly
2465 2470 2475 2480

ACG CTG ACA CTG AGC TTC CCA AAT GCA TCT ATC CCG GAG AAA GGT AAA 7488
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys
2485 2490 2495

CAA GCC ACT ATC TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC 7536
Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg
2500 2505 2510

TAC ACC ATT AAA TAA 7551
Tyr Thr Ile Lys ***
2516



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2516 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47 (TcdA):

Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln Cys
1 5 10 15

Gly Phe Asn Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe
20 25 30

Arg Gln Gln Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu
35 40 45

Tyr His Asp Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala
50 55 60

Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu
65 70 75 80

Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe
85 90 95

Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met
100 105 110

Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn
115 120 125

Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp
130 135 140

Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu Leu Ser
145 150 155 160

Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu
165 170 175

Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe
180 185 190

Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg
195 200 205

Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser
210 215 220

Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn
225 230 235 240

Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr
245 250 255

Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu
260 265 270

Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu
275 280 285

Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly
290 295 300

Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser
305 310 315 320

Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr
325 330 335

Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn
340 345 350

Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu
355 360 365

Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala
370 375 380

Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala
385 390 395 400

Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser
405 410 415

Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn
420 425 430

Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg
435 440 445

Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val
450 455 460



Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu
465 470 475 480

Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu
485 490 495

Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro
500 505 510

Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr
515 520 525

Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly
530 535 540



Asp Trp Arg Lys Thr He Leu Lys Arg Ala Phe Asn He Asp Asp Val 

545 550 555 560 

Ser Leu Phe Arg Leu Leu Lys He Thr Asp His Asp Asn Lys Asp Gly 

565 570 575 



Lys He Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr lie Gly Lys 
40 580 585 590 

Leu Leu Ala Asp He His Gin Leu Thr He Asp Glu Leu Asp Leu Leu 
595 600 605 

45 Leu He Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala lie Ser Asp 
610 615 620 



Lys Gin Leu Ala Thr Leu He Arg Lys Leu Asn Thr He Thr Ser Trp 

625 630 635 640 

Leu His Thr Gin Lys Trp Ser Val Phe Gin Leu Phe He Mec Thr Ser 

645 650 655 



Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu lie Lys Asn Leu Leu Asp 

55 660 665 670 

Thr Val Tyr His Gly Leu Gin Gly Phe Asp Lys Asp Lys Ala Asp Leu 

675 680 685 

60 Leu His Val Mec Ala Pro Tyr lie Ala Ala Thr Leu Gin Leu Ser Ser 

690 695 700 



Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gin Pro 

705 710 715 720 

Gly Asp Gly Ala Mec Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr 

725 730 735 



Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile 
740 745 750 

Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr His Ser 
755 760 765 

Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu 
770 775 780 

Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser 
785 790 795 800 

Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 
805 810 815 

Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 
820 825 830 

Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln 
835 840 845 

Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro 
850 855 860 

Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp 
865 870 875 880 

Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala 
885 890 895 

Leu Val Gly Leu Asp Tyr He Gin Ser Met Lys Glu Thr Pro Thr Tyr 
900 905 910 

Ala Gin Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser 
915 920 925 

Gin Gin Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala 
930 935 940 

Ala Leu Ser Thr Tyr Tyr lie Arg Gin Val Ala Lys Ala Ala Ala Ala 
945 950 955 960 

He Lys Ser Arg Asp Asp Leu Tyr Gin Tyr Leu Leu lie Asp Asn Gin 
965 970 975 

Val Ser Ala Ala He Lys Thr Thr Arg He Ala Glu Ala He Ala ser 
980 985 990 

He Gin Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 
995 1000 1005 

Asn Ser Gly Val He Ser Arg Gin Phe Phe He Asp Trp Asp Lys Tyr 
1010 1015 * 1020 

Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gin Leu Val Tyr Tyr 
1025 1030 1035 1040 

Pro Glu Asn Tyr He Asp Pro Thr Met Arg He Gly Gin Thr Lys Mec 
1045 1050 1055 

Met Asp Ala Leu Leu Gin Ser Val Ser Gin Ser Gin Leu Asn Ala Asp 
1060 1065 1070 

Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gin Val 
1075 1080 1085 

Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn He Asn Asn Asp 
1090 1095 1100 

Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu 
1105 1110 1115 1120 

Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 
1125 1130 1135 

Ala Ala Asn Ala Trp Ser Glu Trp His Lys lie Asp Cys Pro lie Asn 
1140 1145 1150 

Pro Tyr Lys Ser Thr lie Arg Pro Val He Tyr Lys Ser Arg Leu Tyr 
1155 1160 1155 

Leu Leu Trp Leu Glu Gin Lys Glu He Thr Lys Gin Thr Gly Asn Ser 
1170 1175 1180 

Lys Asp Gly Tyr Gin Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 
1135 1190 1195 1200 



Ala His He Arg Tyr Asp Gly Thr Trp Asn Thr Pro lie Thr Phe Asp 
15 1205 1210 1215 

Val Asn Lys Lys He Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 
1220 1225 1230 

20 Gly Leu Tyr Cys Ala Gly Tyr Gin Gly Glu Asp Thr Leu Leu Val Met 
1235 1240 1245 



Phe Tyr Asn Gin Gin Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Mec 
1250 1255 i260 

Gin Gly Leu Tyr He Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 
1265 1270 1275 1280 



Glu Gin Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gin Gin Phe Asp Thr 
30 1285 1290 1295 

Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu He 
1300 1305 1310 

35 Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 
1315 1320 1325 



Leu Ser Met Val Tyr Asn Gly Asp He Pro Thr He Asn Tyr Lys Ala 
1330 1335 1340 

Ala Ser Ser Asp Leu Lys He Tyr He Ser Pro Lys Leu Arg He He 
1345 1350 1355 1360 



His Asn Gly Tyr Glu Gly Gin Lys Arg Asn Gin Cys Asn Leu Met Asn 
45 1365 1370 1375 

Lys Tyr Gly Lys Leu Gly Asp Lys Phe He Val Tyr Thr Ser Leu Gly 
1380 1385 1390 

50 Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 

1395 1400 1405 



Gin Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gin Gly Arg Leu Leu Phe 
1410 1415 1420 

His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp He Pro Gly 
1425 1430 1435 1440 



Ala Lys Arg Ser Leu Thr Asn Gin Asn Ala Ala He Gly Asp Asp Tyr 
hO 1445 1450 1455 

Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gin Tyr He Phe 
1460 1465 1470 

65 Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 
1475 1480 1485 



He Asn Thr Ala He Ser Pro Ala Lys Val Gin He He Val Lys Ala 

1490 1495 1500 

Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gln 
1505 1510 1515 1520 

Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu 
1525 1530 1535 

Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp 
1540 1545 1550 

Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 
1555 1560 1565 

Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 
1570 1575 1580 

Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp Gln Ser 
1585 1590 1595 1600 

Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg 
1605 1610 1615 

Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile 
1620 1625 1630 

Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro 
1635 1640 1645 

Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 
1650 1655 1660 

Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln 
1665 1670 1675 1680 

Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp 
1685 1690 1695 

Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 
1700 1705 1710 

Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 
1715 1720 1725 

Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 
1730 1735 1740 

Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe 
1745 1750 1755 1760 

Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 
1765 1770 1775 

Met Leu Val Ala Gin Arg Leu Leu His Glu Gin Asn Phe Asp Glu Ala 
1780 1785 1790 



Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His 
1795 1800 1805 

Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp 
1810 1815 1820 

Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 
1825 1830 1835 1840 

Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 
1845 1850 1855 

Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu 
1860 1865 1870 

Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His 
1875 1880 1885 

Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 
1890 1895 1900 

Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp 
1905 1910 1915 1920 

Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu 
1925 1930 1935 

Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile 
1940 1945 1950 

Asn Glu Val Met Met Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr 
1955 1960 1965 

Asn Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro 

1970 1975 1980 



He Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 
20 1985 1990 1995 2000 

Ala Thr ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 
2005 2010 2015 

25 Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gin 
2020 2025 2030 



Leu Thr Gin Phe Gly Ser Thr Leu Gin Asn He He Glu Arg Gin Asp 
2035 2040 2045 

Ala Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu He 
2050 2055 2060 



Leu Thr Asn Leu Ser He Gin Asp Lys Thr lie Glu Glu Leu Asp Ala 
35 2065 2070 2075 2080 

Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gin Ser Arg Phe 
2085 2090 2095 

40 Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly Glu Asn 
2100 2105 2110 



Gin Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 
2115 2120 2125 

Gin Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He 
2130 2135 2140 



Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala Glu Ala 
50 2145 2150 2155 2160 

Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 
2165 2170 2175 

55 Asp Lys He Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Trp 
2180 2185 2190 



Glu He Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin He Asp Ala 
2195 2200 2205 

Gin Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys 
2210 2215 2220 



Thr Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe 
65 2225 2230 2235 2240 

Leu Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly 
2245 2250 2255 

Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg 
2260 2265 2270 

Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 
2275 2280 2285 

Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu 
2290 2295 2300 

Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala 
2305 2310 2315 2320 

His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 
2325 2330 2335 

Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 
2340 2345 2350 

Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala 
2355 2360 2365 

Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 
2370 2375 2380 

Thr Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu 
2385 2390 2395 2400 

Asp Tyr Pro Ala Ser Leu Gly Lys He Arg Arg He Lys Gin He Ser 
2405 2410 2415 

Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gin Asp Val Gin Ala He 
2420 2425 2430 

Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu 
2435 2440 2445 

Ala Val Ser His Gly Met Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe 
2450 2455 2460 

Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly He Ala He Asp Gin Gly 
2465 2470 2475 2480 

Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys 
2485 2490 2495 

Gin Ala Thr Mec Leu Lys Thr Leu Asn Asp He He Leu His He Arg 
2500 2505 2510 

Tyr Thr He Lys 
2516 

(2) INFORMATION FOR SEQ ID NO: 48: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5547 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48 (tcdAii coding region) 

CTG ATA GGC TAT AAC AAT CAA TTT AGC GGT AGA GCC AGT CAA TAT GTT 48 
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val 
1 5 10 15 

GCG CCG GGT ACC GTT TCT TCC ATG TTC TCC CCC GCC GCT TAT TTG ACT 96 
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr 
20 25 30 

GAA CTT TAT CGT GAA GCA CGC AAT TTA CAC GCA AGT GAC TCC GTT TAT 144 
Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr 
35 40 45 

TAT CTG GAT ACC CGC CGC CCA GAT CTC AAA TCA ATG GCG CTC AGT CAG 192 
Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln 
50 55 60 

CAA AAT ATG GAT ATA GAA TTA TCC ACA CTC TCT TTG TCC AAT GAG CTG 240 
Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu 
65 70 75 80 

TTA TTG GAA AGC ATT AAA ACT GAA TCT AAA CTG GAA AAC TAT ACT AAA 288 
Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys 
85 90 95 

GTG ATG GAA ATG CTC TCC ACT TTC CGT CCT TCC GGC GCA ACG CCT TAT 336 
Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr 
100 105 110 

CAT GAT GCT TAT GAA AAT GTG CGT GAA GTT ATC CAG CTA CAA GAT CCT 384 
His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro 
115 120 125 

GGA CTT GAG CAA CTC AAT GCA TCA CCG GCA ATT GCC GGG TTG ATG CAT 432 
Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His 
130 135 140 

CAA GCC TCC CTA TTG GGT ATT AAC GCT TCA ATC TCG CCT GAG CTA TTT 480 
Gln Ala Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe 
145 150 155 160 

AAT ATT CTG ACG GAG GAG ATT ACC GAA GGT AAT GCT GAG GAA CTT TAT 528 
Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr 
165 170 175 

AAG AAA AAT TTT GGT AAT ATC GAA CCG GCC TCA TTG GCT ATG CCG GAA 576 
Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu 
180 185 190 

TAC CTT AAA CGT TAT TAT AAT TTA AGC GAT GAA GAA CTT AGT CAG TTT 624 
Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe 
195 200 205 

ATT GGT AAA GCC AGC AAT TTT GGT CAA CAG GAA TAT AGT AAT AAC CAA 672 
Ile Gly Lys Ala Ser Asn Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln 
210 215 220 

CTT ATT ACT CCG GTA GTC AAC AGC AGT GAT GGC ACG GTT AAG GTA TAT 720 
Leu Ile Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr 
225 230 235 240 

CGG ATC ACC CGC GAA TAT ACA ACC AAT GCT TAT CAA ATG GAT GTG GAG 768 
Arg Ile Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu 
245 250 255 

CTA TTT CCC TTC GGT GGT GAG AAT TAT CGG TTA GAT TAT AAA TTC AAA 816 
Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys 
260 265 270 

AAT TTT TAT AAT GCC TCT TAT TTA TCC ATC AAG TTA AAT GAT AAA AGA 864 
Asn Phe Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg 
275 280 285 

GAA CTT GTT CGA ACT GAA GGC GCT CCT CAA GTC AAT ATA GAA TAC TCC 912 
Glu Leu Val Arg Thr Glu Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser 
290 295 300 

GCA AAT ATC ACA TTA AAT ACC GCT GAT ATC AGT CAA CCT TTT GAA ATT 960 
Ala Asn Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile 
305 310 315 320 

GGC CTG ACA CGA GTA CTT CCT TCC GGT TCT TGG GCA TAT GCC GCC GCA 1008 
Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala 
325 330 335 

AAA TTT ACC GTT GAA GAG TAT AAC CAA TAC TCT TTT CTG CTA AAA CTT 1056 
Lys Phe Thr Val Glu Glu Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu 
340 345 350 

AAC AAG GCT ATT CGT CTA TCA CGT GCG ACA GAA TTG TCA CCC ACG ATT 1104 
Asn Lys Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile 
355 360 365 

CTG GAA GGC ATT GTG CGC AGT GTT AAT CTA CAA CTG GAT ATC AAC ACA 1152 
Leu Glu Gly Ile Val Arg Ser Val Asn Leu Gln Leu Asp Ile Asn Thr 
370 375 380 

GAC GTA TTA GGT AAA GTT TTT CTG ACT AAA TAT TAT ATG CAG CGT TAT 1200 
Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr 
385 390 395 400 

GCT ATT CAT GCT GAA ACT GCC CTG ATA CTA TGC AAC GCG CCT ATT TCA 1248 
Ala Ile His Ala Glu Thr Ala Leu Ile Leu Cys Asn Ala Pro Ile Ser 
405 410 415 

CAA CGT TCA TAT GAT AAT CAA CCT AGC CAA TTT GAT CGC CTG TTT AAT 1296 
Gln Arg Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn 
420 425 430 

ACG CCA TTA CTG AAC GGA CAA TAT TTT TCT ACC GGC GAT GAG GAG ATT 1344 
Thr Pro Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile 
435 440 445 

GAT TTA AAT TCA GGT AGC ACC GGC GAT TGG CGA AAA ACC ATA CTT AAG 1392 
Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys 
450 455 460 

CGT GCA TTT AAT ATT GAT GAT GTC TCG CTC TTC CGC CTG CTT AAA ATT 1440 
Arg Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile 
465 470 475 480 

ACC GAC CAT GAT AAT AAA GAT GGA AAA ATT AAA AAT AAC CTA AAG AAT 1488 
Thr Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn 
485 490 495 

CTT TCC AAT TTA TAT ATT GGA AAA TTA CTG GCA GAT ATT CAT CAA TTA 1536 
Leu Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu 
500 505 510 

ACC ATT GAT GAA CTG GAT TTA TTA CTG ATT GCC GTA GGT GAA GGA AAA 1584 
Thr Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys 
515 520 525 

ACT AAT TTA TCC GCT ATC AGT GAT AAG CAA TTG GCT ACC CTG ATC AGA 1632 
Thr Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg 
530 535 540 

AAA CTC AAT ACT ATT ACC AGC TGG CTA CAT ACA CAG AAG TGG AGT GTA 1680 
Lys Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val 
545 550 555 560 

TTC CAG CTA TTT ATC ATG ACC TCC ACC AGC TAT AAC AAA ACG CTA ACG 1728 
Phe Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr 
565 570 575 

CCT GAA ATT AAG AAT TTG CTG GAT ACC GTC TAC CAC GGT TTA CAA GGT 1776 
Pro Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly 
580 585 590 



TTT GAT AAA GAC AAA GCA GAT TTG CTA CAT GTC ATG GCG CCC TAT ATT 1824 
Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile 
595 600 605 

GCG GCC ACC TTG CAA TTA TCA TCG GAA AAT GTC GCC CAC TCG GTA CTC 1872 
Ala Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu 
610 615 620 

CTT TGG GCA GAT AAG TTA CAG CCC GGC GAC GGC GCA ATG ACA GCA GAA 1920 
Leu Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu 
625 630 635 640 

AAA TTC TGG GAC TGG TTG AAT ACT AAG TAT ACG CCG GGT TCA TCG GAA 1968 
Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu 
645 650 655 

GCC GTA GAA ACG CAG GAA CAT ATC GTT CAG TAT TGT CAG GCT CTG GCA 2016 
Ala Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala 
660 665 670 

CAA TTG GAA ATG GTT TAC CAT TCC ACC GGC ATC AAC GAA AAC GCC TTC 2064 
Gln Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe 
675 680 685 

CGT CTA TTT GTC ACA AAA CCA GAG ATG TTT GGC GCT GCA ACT GGA GCA 2112 
Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala 
690 695 700 

GCG CCC GCG CAT GAT GCC CTT TCA CTG ATT ATG CTG ACA CGT TTT GCG 2160 
Ala Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala 
705 710 715 720 

GAT TGG GTG AAC GCA CTA GGC GAA AAA GCG TCC TCG GTG CTA GCG GCA 2208 
Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala 
725 730 735 

TTT GAA GCT AAC TCG TTA ACG GCA GAA CAA CTG GCT GAT GCC ATG AAT 2256 
Phe Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn 
740 745 750 

CTT GAT GCT AAT TTG CTG TTG CAA GCC AGT ATT CAA GCA CAA AAT CAT 2304 
Leu Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His 
755 760 765 

CAA CAT CTT CCC CCA GTA ACT CCA GAA AAT GCG TTC TCC TGT TGG ACA 2352 
Gln His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr 
770 775 780 

TCT ATC AAT ACT ATC CTG CAA TGG GTT AAT GTC GCA CAA CAA TTG AAT 2400 
Ser Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Asn 
785 790 795 800 

GTC GCC CCA CAG GGC GTT TCC GCT TTG GTC GGG CTG GAT TAT ATT CAA 2448 
Val Ala Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln 
805 810 815 

TCA ATG AAA GAG ACA CCG ACC TAT GCC CAG TGG GAA AAC GCG GCA GGC 2496 
Ser Met Lys Glu Thr Pro Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly 
820 825 830 

GTA TTA ACC GCC GGG TTG AAT TCA CAA CAG GCT AAT ACA TTA CAC GCT 2544 
Val Leu Thr Ala Gly Leu Asn Ser Gln Gln Ala Asn Thr Leu His Ala 
835 840 845 

TTT CTG GAT GAA TCT CGC AGT GCC GCA TTA AGC ACC TAC TAT ATC CGT 2592 
Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg 
850 855 860 

CAA GTC GCC AAG GCA GCG GCG GCT ATT AAA AGC CGT GAT GAC TTG TAT 2640 
Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr 
865 870 875 880 



CAA TAC TTA CTG ATT GAT AAT CAG GTT TCT GCG GCA ATA AAA ACC A:: 2688 
Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr 
885 890 895 

CGG ATC GCC GAA GCC ATT GCC AGT ATT CAA CTG TAC GTC AAC CGG GCA 2736 
Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala 
900 905 910 

TTG GAA AAT GTG GAA GAA AAT GCC AAT TCG GGG GTT ATC AGC CGC CAA 2784 
Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln 
915 920 925 

TTC TTT ATC GAC TGG GAC AAA TAC AAT AAA CGC TAC AGC ACT TGG GCG 2832 
Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala 
930 935 940 

GGT GTT TCT CAA TTA GTT TAC TAC CCG GAA AAC TAT ATT GAT CCG ACC 2880 
Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr 
945 950 955 960 

ATG CGT ATC GGA CAA ACC AAA ATG ATG GAC GCA TTA CTG CAA TCC GTC 2928 
Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val 
965 970 975 



25 AGC CAA AGC CAA TTA AAC GCC GAT ACC GTC GAA GAT GCC TTT ATG TCT 2 976 

Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Aia Phe Met Ser 

980 985 990 

TAT CTG AC A TCG TTT GAA CAA GTG GCT AAT CTT AAA GTT ATT AGC GCA 3 02 4 

30 Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He Ser Ala 

995 1000 1005 

TAT CAC GAT AAT ATT AAT AAC GAT CAA GGG CTG ACC TAT TTT ATC GGA 3 07 2 

Tyr His Asp Asn He Asn Asn Asp Gin Gly Leu Thr Tyr Phe He Gly 

35 1010 1015 1020 

CTC AGT GAA ACT GAT GCC GGT GAA TAT TAT TGG CGC AGT GTC GAT CAC 3120 

Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 

1025 1030 1035 1040 



AGT AAA TTC AAC GAC GGT AAA TTC GCG GCT AAT GCC TGG AGT GAA TGG 316a 
Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 
1045 1050 1055 



45 CAT AAA ATT GAT TCT CCA ATT AAC CCT TAT AAA AGC ACT ATC CGT CCA 3216 
His Lys He Asp Cys Pro He Asn Pro Tyr Lys Ser Thr He Arg Pro 
1060 1065 1070 

GTG ATA TAT AAA TCC CGC CTG TAT CTG CTC TGG TTG GAA CAA AAG GAG 3264 
50 Val He Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys Glu 
1075 1080 1085 

ATC ACC AAA CAG ACA GGA AAT AGT AAA GAT GGC TAT CAA ACT GAA ACG 3 312 
He Thr Lys Gin Thr Gly Asn Ser Lys Asp Gly Tyr Gin Thr Glu Thr 
55 1090 1095 1100 

GAT TAT CGT TAT GAA CTA AAA TTG GCG CAT ATC CGC TAT GAT GGC ACT 3 360 
Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His He Arg Tyr Asp Gly Thr 
1105 1110 1115 1120 



TGG AAT ACG CCA ATC ACC TTT GAT GTC AAT AAA AAA ATA TCC GAG CTA 3 4 0 3 
Trp Asn Thr Pro He Thr Phe Asp Val Asn Lys Lys He Ser Glu Leu 
1125 1130 1135 



65 AAA CTG GAA AAA AAT AGA GCG CCC GGA CTC TAT TGT GCC GGT TAT CAA 3 45 5 

Lys Leu Glu Lys Asn Arg Ala Pro Gly Lau Tyr Cys Ala Gly T/r Gin 
1140 1145 1150 

GGT GAA GAT ACG TTG CTG GTG ATG TTT TAT AAC CAA CAA GAC ACA CTA 3504 
Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu 
1155 1160 1165 

GAT AGT TAT AAA AAC GCT TCA ATG CAA GGA CTA TAT ATC TTT GCT GAT 3552 
Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp 
1170 1175 1180 

5 

ATG GCA TCC AAA GAT ATG ACC CCA GAA CAG AGC AAT GTT TAT CGG GAT 3 6 00 

Met Ala Ser Lys Asp Mec Thr Pro Glu Gin Ser Asn Val Tyr Arg Asp 
1135 1190 1195 1200 

H) AAT AGC TAT CAA CAA TTT GAT ACC AAT AAT GTC AGA AGA GTG AAT AAC 3 6-13 
Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 
1205 1210 1215 

CGC TAT GCA GAG GAT TAT GAG ATT CCT TCC TCG GTA AGT AGC CGT .AAA 3 69 6 
15 Arg Tyr Ala Glu Asp Tyr Glu lie Pro Ser Ser Val Ser Ser Arg Lys 
1220 1225 1230 

GAC TAT GGT TCG GGA GAT TAT TAC CTC AGC ATG GTA TAT AAC GGA GAT 37 4 4 
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Mec Val Tyr Asn Gly Asp 
20 1235 1240 1245 

ATT CCA ACT ATC AAT TAC AAA GCC GCA TCA AGT GAT TTA AAA ATC TAT 3 79 2 
lie Pro Thr lie Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys He Tyr 
1250 1255 1260 

25 

ATC TCA CCA AAA TTA AGA ATT ATT CAT AAT GGA TAT GAA GGA CAG AAG 3 340 
He Ser Pro Lys Leu Arg He He His A.sn Gly Tyr Glu Gly Gin Lys 
1265 1270 1275 1230 

30 CGC AAT CAA TCC AAT CTG ATG AAT AAA TAT GGC AAA CTA GGT GAT AAA 3S83 
Arg Asn Gin Cys Asn Leu Mec Asn Lys Tyr Gly Lys Leu Gly Asp Lys 
1285 1290 1295 

TTT ATT GTT TAT ACT AGC TTG GGG GTC AAT CCA AAT AAC TCG TCA AAT 39 36 
35 Phe He Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 
1300 1305 1310 

AAG CTC ATG TTT TAC CCC GTC TAT CAA TAT AGC GGA AAC ACC AGT GGA 3 984 
Lys Leu Mec Phe Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly 
40 1315 1320 1325 

CTC AAT CAA GGG AGA CTA CTA TTC CAC CGT GAC ACC ACT TAT CCA TCT 403 2 
Leu Asn Gin Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 
1330 1335 1340 






AAA GTA GAA GCT TCG ATT CCT GGA GCA AAA CGT TCT CTA ACC AAC CAA 4 080 
Lys Val Glu Ala Trp He Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin 
1345 1350 1355 1360 



50 AAT GCC GCC ATT GGT GAT GAT TAT GCT AC A GAC TCT CTG AAT AAA CCG 4128 

Asn Ala Ala He Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 
1365 1370 1375 

GAT GAT CTT AAG CAA TAT ATC TTT ATG ACT GAC AGT AAA GGG ACT GCT 417 6 

55 Asp Asp Leu Lys Gin Tyr He Phe Mec Thr Asp Ser Lys Gly Thr Ala 

1380 1385 1390 

ACT GAT GTC TCA GGC CCA GTA GAG ATT AAT ACT GCA ATT TCT CCA GCA 4 22 4 

Thr Asp Val Ser Gly Pro Val Glu lie Asn Thr Ala He Ser Pro Ala 
60 1395 1400 1405 

AAA GTT CAG ATA ATA GTC AAA GCG GGT GGC AAG GAG CAA ACT TTT ACC 4 272 

Lys Val Gin He He Val Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr 
1410 1415 1420 



GCA GAT AAA GAT GTC TCC ATT CAG CCA TCA CCT AGC TTT GAT GAA ATG 4 3 20 
Ala Asp Lys Asp Val Ser He Gin Pro Ser Pro Ser Phe Asp Glu Mec 
1425 1430 1435 1440 



AAT TAT CAA TTT AAT GCC CTT GAA ATA GAC GGT TCT GGT CTG AAT TTT 4368 
Asn Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe 
1445 1450 1455 

ATT AAC AAC TCA GCC AGT ATT GAT GTT ACT TTT ACC GCA TTT GCG GAG 4416 
Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu 
1460 1465 1470 

GAT GGC CGC AAA CTG GGT TAT GAA AGT TTC AGT ATT CCT GTT ACC CTC 4 4 64 
Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser He Pro Val Thr Leu 
1475 1480 1485 

AAG GTA AGT ACC GAT AAT GCC CTG ACC CTG CAC CAT AAT GAA AAT GGT 4 51^ 
Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly 
1490 1495 150 o 

GCG CAA TAT ATG CAA TGG CAA TCC TAT CGT ACC CGC CTG AAT ACT TA 4 5 60 
Ala Gin Tyr Met Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 
±505 1510 1515 1520 

TTT GCC CGC CAG TTC GTT GCA CGC GCC ACC ACC GGA ATC GAT ACA ATT 4 6 03 
Phe Ala Arg Gin Leu Val Ala Arg Ala Thr Thr Gly lie Asp Thr He 
1525 1530 1535 

CTG AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAG TTA GGC AAA GGT 4 656 
Leu Ser Mec Glu Thr Gin Asn He Gin Glu Pro Gin Leu Gly Lys Glv 
1540 1545 1550 

TTC TAT GCT ACG TTC GTG ATA CCT CCC TAT AAC CTA TCA ACT CAT GGT 47 04 
Phe Tyr Ala Thr Phe Val He Pro Pro Tyr Asn Leu Ser Thr His Gly 
1555 1560 1565 

GAT GAA CGT TGG TTT AAG CTT TAT ATC AAA CAT GTT GTT GAT AAT AAT 47 52 
Asp Glu Arg Trp Phe Lys Leu Tyr He Lys His Val Val Asp Asn Asn 
1570 1575 1580 

TCA CAT ATT ATC TAT TCA GGC CAG CTA ACA GAT ACA AAT ATA AAC ATC 4 3 00 
Ser His He He Tyr Ser Gly Gin Leu Thr Asp Thr Asn He Asn He 
i585 1590 1595 1600 

ACA TTA TTT ATT CCT CTT GAT GAT GTC CCA TTG AAT CAA GAT TAT CAC 43 43 
Thr Leu Phe He Pro Leu Asp Asp Val Pro Leu Asn Gin Asp Tyr His 
1605 1610 1615 

GCC AAG GTT TAT ATG ACC TTC AAG AAA TCA CCA TCA GAT GGT ACC TGG 4896 
Ala Lys Val Tyr Mec Thr Phe Lys Lys Ser Pro ser Asp Gly Thr Trp 
1620 1625 1630 

TGG GGC CCT CAC TTT GTT AGA GAT GAT AAA GGA ATA GTA ACA ATA AAC 49 44 
Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly He Val Thr He Asn 
1635 1640 1645 

CCT AAA TCC ATT TTG ACC CAT TTT GAG AGC GTC AAT GTC CTG AAT AAT 4992 
Pro Lys Ser He Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn 
1650 1655 1660 

ATT AGT AGC GAA CCA ATG GAT TTC AGC GGC GCT AAC AGC CTC TAT TTC 504 0 
He Ser Ser Glu Pro Mec Asp Phe Ser Gly Ala Asn Ser Leu T/r Phe 
l«5o5 1570 1675 1680 

TGG GAA CTG TTC TAC TAT ACC CCG ATG CTG GTT GCT CAA CGT TTG CTG 5088 
Trp Glu Leu Phe Tyr Tyr Thr Pro Mec Leu Val Ala Gin Arg Leu Leu 
1685 1690 1695 

CAT GAA CAG AAC TTC GAT GAA GCC AAC CGT TGG CTG AAA TAT GTC TGG 513 6 
His Glu Gin Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp 
1700 1705 1710 

AGT CCA TCC GGT TAT ATT GTC CAC GGC CAG ATT CAG AAC TAC CAG TGG 5134 
3er Pro Ser Gly Tyr He Val His Gly Gin He Gin Asn Tyr Gin Trp 
1715 1720 1725 

AAC GTC CGC CCG TTA CTG GAA GAC ACC AGT TGG AAC AGT GAT CCT TTG 5232 
Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu 
1730 1735 1740 

GAT TCC GTC GAT CCT GAC GCG GTA GCA CAG CAC GAT CCA ATG CAC TAC 5280 
Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr 
1745 1750 1755 1760 

AAA GTT TCA ACT TTT ATG CGT ACC TTG GAT CTA TTG ATA GCA CGC GGC 5328 
Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly 
1765 1770 1775 

GAC CAT GCT TAT CGC CAA CTG GAA CGA GAT ACA CTC AAC GAA GCG AAG 5376 
Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys 
1780 1785 1790 

ATG TGG TAT ATG CAA GCG CTG CAT CTA TTA GGT GAC AAA CCT TAT CTA 5424 
Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 
1795 1800 1805 

CCG CTG AGT ACG ACA TGG AGT GAT CCA CGA CTA GAC AGA GCC GCG GAT 5472 
Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp 
1810 1815 1820 

ATC ACT ACC CAA AAT GCT CAC GAC AGC GCA ATA GTC GCT CTG CGG CAG 5520 
Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln 
1825 1830 1835 1840 

AAT ATA CCT ACA CCG GCA CCT TTA TCA 5547 
Asn Ile Pro Thr Pro Ala Pro Leu Ser 
1845 1849 
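
As an illustrative cross-check that is not part of the original listing, the paired nucleotide and amino acid rows of SEQ ID NO: 48 can be verified by translating each codon with the standard genetic code. The sketch below is a minimal Python example; the names `translate` and `codon_count` are hypothetical helpers introduced here, and the only data taken from the listing are the first sixteen codons of SEQ ID NO: 48 and the stated lengths (5547 base pairs, 1849 amino acids).

```python
# Standard genetic code, written in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter residue names, as used in the sequence listing.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

# Build the 64-entry codon table from the two strings above.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [ONE_TO_THREE[CODON_TABLE[seq[i:i + 3]]]
            for i in range(0, len(seq), 3)]


def codon_count(length_bp):
    """Return the number of whole codons in a coding region of given length."""
    if length_bp % 3 != 0:
        raise ValueError("length is not a multiple of three")
    return length_bp // 3


# First row of SEQ ID NO: 48 (nucleotides 1-48) as given in the listing.
first_row = "CTG ATA GGC TAT AAC AAT CAA TTT AGC GGT AGA GCC AGT CAA TAT GTT"
print(" ".join(translate(first_row)))
print(codon_count(5547))
```

Run on the first row, the translation should agree with residues 1 to 16 of SEQ ID NO: 49, and the codon count for 5547 base pairs should agree with the stated peptide length. The same helper can be applied row by row to flag codons whose printed residue does not match.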



(2) INFORMATION FOR SEQ ID NO: 49: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1849 amino acids 

(B) TYPE: amino acids 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 (TcdAii) 

Features    From    To      Description 
Peptide     1       1849    TcdAii peptide 
Fragment    1       12      S2 N-terminus (SEQ ID NO 
Fragment    196     211     (SEQ ID NO: 38) 
Fragment    466     475     (SEQ ID NO: 17) 
Fragment    993     1004    (SEQ ID NO: 23; 12/13) 
Fragment    1297    1312    (SEQ ID NO: 18) 
Fragment    1390    1409    (SEQ ID NO: 39) 
Fragment    1532    1554    (SEQ ID NO: 21; 19/23) 



Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val 
1 5 10 15 

Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr 
20 25 30 

Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr 
35 40 45 

Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln 
50 55 60 

Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu 
65 70 75 80 

Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys 
85 90 95 

Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr 
100 105 110 

His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro 
115 120 125 

Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His 
130 135 140 

Gln Ala Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe 
145 150 155 160 

Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr 
165 170 175 

Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu 
180 185 190 

Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe 
195 200 205 

Ile Gly Lys Ala Ser Asn Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln 
210 215 220 

Leu Ile Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr 
225 230 235 240 

Arg Ile Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu 
245 250 255 



Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys
260 265 270

Asn Phe Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg
275 280 285

Glu Leu Val Arg Thr Glu Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser
290 295 300

Ala Asn Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile
305 310 315 320

Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala
325 330 335

Lys Phe Thr Val Glu Glu Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu
340 345 350

Asn Lys Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile
355 360 365

Leu Glu Gly Ile Val Arg Ser Val Asn Leu Gln Leu Asp Ile Asn Thr
370 375 380

Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr
385 390 395 400

Ala Ile His Ala Glu Thr Ala Leu Ile Leu Cys Asn Ala Pro Us Ser
405 410 415

Gln Arg Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn
420 425 430

Thr Pro Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile
435 440 445

Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys
450 455 460

lro iia Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile
465 470 475 480

Thr Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn
485 490 495

Leu Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu
500 505 510

Thr ll a Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys
515 520 525

Thr Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg
530 535 540

Lys Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val
545 550 555 560

Phe Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr
565 570 575

Pro Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly
580 585 590

Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile
595 600 605

Ala Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu
610 615 620

Leu Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu
625 630 635 640

Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu
645 650 655

Ala Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala
660 665 670

Gln Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe
675 680 685

Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala
690 695 700

Ala Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala
705 710 715 720

Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala
725 730 735

Phe Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn
740 745 750

Leu Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His
755 760 765

Gln His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr
770 775 780

Ser Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Asn
785 790 795 800

Val Ala Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln
805 810 815

Ser Met Lys Glu Thr Pro Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly
820 825 830

Val Leu Thr Ala Gly Leu Asn Ser Gln Gln Ala Asn Thr Leu His Ala
835 840 845

Phe Leu Asp Glu 3-=r Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg
850 855 860

Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr
865 870 875 880

Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr
885 890 895

Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala
900 905 910

Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln
915 920 925

Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala
930 935 940

Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr
945 950 955 960

Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val
965 970 975

Ser Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser
980 985 990

Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala
995 1000 1005

Tyr His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly
1010 1015 1020

Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His
1025 1030 1035 1040

Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp
1045 1050 1055

His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro
1060 1065 1070

Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu
1075 1080 1085

Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr
1090 1095 1100

Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
1105 1110 1115 1120

Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
1125 1130 1135

Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
1140 1145 1150

Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
1155 1160 1165

Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
1170 1175 1180

Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
1185 1190 1195 1200

Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
1205 1210 1215

Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys
1220 1225 1230

Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
1235 1240 1245

Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
1250 1255 1260

Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
1265 1270 1275 1280

Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
1285 1290 1295

Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1300 1305 1310

Lys Leu Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly
1315 1320 1325

Leu Asn Gln Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser
1330 1335 1340

Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln
1345 1350 1355 1360

Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro
1365 1370 1375

Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala
1380 1385 1390

Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala
1395 1400 1405

Lys Val Gln Ile Ile Val Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr
1410 1415 1420

Ala Asp Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met
1425 1430 1435 1440

Asn Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe
1445 1450 1455

Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu
1460 1465 1470

Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu
1475 1480 1485

Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly
1490 1495 1500

Ala Gln Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu
1505 1510 1515 1520

Phe Ala Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile
1525 1530 1535

Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly
1540 1545 1550

Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly
1555 1560 1565

Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn
1570 1575 1580

Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile Asn Ile
1585 1590 1595 1600

Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His
1605 1610 1615

Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp
1620 1625 1630

Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn
1635 1640 1645

Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn
1650 1655 1660

Ile Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe
1665 1670 1675 1680

Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu
1685 1690 1695

His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp
1700 1705 1710

Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp
1715 1720 1725

Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu
1730 1735 1740

Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr
1745 1750 1755 1760

Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly
1765 1770 1775

Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys
1780 1785 1790

Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu
1795 1800 1805

Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp
1810 1815 1820

Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln
1825 1830 1835 1840

Asn Ile Pro Thr Pro Ala Pro Leu Ser
1845 1849



(2) INFORMATION FOR SEQ ID NO: 50: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1740 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 (TcdAiii coding region;

TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC AAT 48
Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn
1 5 10 15

CAA GTG ATC ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC AAT 96
Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr Asn
20 25 30

CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA ATC 144
Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro Ile
35 40 45

TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT GCC 192
Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala
50 55 60

ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG TGG 240
Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu Trp
65 70 75 80

CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG CTC 288
Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln Leu
85 90 95

ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC GCG 336
Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp Ala
100 105 110

GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA TTG 384
Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile Leu
115 120 125

ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC GAG 432
Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala Glu
130 135 140

AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA CAA TCG CGC TTT GAT 480
Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe Asp
145 150 155 160

AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC AAC GCC GGT GAA AAC CAA 528
Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn Gln
165 170 175

GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT CAG 576
Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gln
180 185 190

GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC TTC 624
Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile Phe
195 200 205

GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG ACA 672
Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala Thr
210 215 220

GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG GAT 720
Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp
225 230 235 240

AAA ATT AGC CAA TCT GAA ACC TAC CGT CCT CGC CGT CAG GAG TGG GAG 768
Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu
245 250 255

ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT CAG 816
Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala Gln
260 265 270

CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA ACC 864
Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys Thr
275 280 285

AGT CTC AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC CTG 912
Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe Leu
290 295 300

CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT CGA 960
Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly Arg
305 310 315 320

CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG CCC GTC GCG CGT TCC 1008
Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg Cys
325 330 335

CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT CCC 1056
Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala
340 345 350

CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG CTT 1104
Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu
355 360 365

GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT CAT 1152
Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala His
370 375 380

CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG CTG 1200
Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400

GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC CTG 1248
Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu
405 410 415

GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA GGC AGT GCC GGC 1296
Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala Gly
420 425 430

AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA ACC 1344
Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr
435 440 445

TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA GAT 1392
Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu Asp
450 455 460

TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC GTC 1440
Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val
465 470 475 480



ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA TTG 1488
Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu
485 490 495

TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GCC TGT GAA GCG CTG GCA 1536
Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala
500 505 510

GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC AAC 1584
Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn
515 520 525

GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC ACG 1632
Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly Thr
530 535 540

CTG ACA CTG AGC TTC CCA AAT GCA TCT ATG CCG GAG AAA GGT AAA CAA 1680
Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gln
545 550 555 560

GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC TAC 1728
Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg Tyr
565 570 575

ACC ATT AAA TAA 1740
Thr Ile Lys • • •
579



(2) INFORMATION FOR SEQ ID NO: 51:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 579 amino acids

(B) TYPE: amino acids

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51 (TcdAiii):

Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn
1 5 10 15

Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr Asn
20 25 30

Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro Ile
35 40 45

Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala
50 55 60

Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu Trp
65 70 75 80

Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln Leu
85 90 95

Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp Ala
100 105 110

Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile Leu
115 120 125

Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala Glu
130 135 140

Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe Asp
145 150 155 160

Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn Gln
165 170 175



Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gln
180 185 190

Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile Phe
195 200 205

Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala Thr
210 215 220

Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp
225 230 235 240

Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu
245 250 255

Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala Gln
260 265 270

Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys Thr
275 280 285

Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe Leu
290 295 300

Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly Arg
305 310 315 320

Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg Cys
325 330 335

Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala
340 345 350

Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu
355 360 365

Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala His
370 375 380

Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu
385 390 395 400

Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu
405 410 415

Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala Gly
420 425 430

Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr
435 440 445

Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu Asp
450 455 460

Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val
465 470 475 480

Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu
485 490 495

Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala
500 505 510

Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn
515 520 525

Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly Thr
530 535 540

Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gln
545 550 555 560

Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg Tyr
565 570 575

Thr Ile Lys •••
579



(2) INFORMATION FOR SEQ ID NO: 52: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5532 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52 (TcdAiii coding region!



TTT ATA CAA GGT TAT ACT GAT CTG TTT GGT AAT CCT GCT GAT AAC TAT 48
Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr
1 5 10 15

CCC GCG CCG GGC TCG GTT GCA TCG ATG TTC TCA CCG GCG GCT TAT TTG 96
Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu
20 25 30

ACG GAA TTG TAC CGT GAA GCC AAA AAC TTG CAT GAC AGC AGC TCA ATT 144
Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser Ile
35 40 45

TAT TAC CTA GAT AAA CGT CGC CCG GAT TTA GCA AGC TTA ATG CTC AGC 192
Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser
50 55 60

CAG AAA AAT ATG GAT GAG GAA ATT TCA ACG CTG GCT CTC TCT AAT GAA 240
Gln Lys Asn Met Asp Glu Glu Ile Ser Thr Leu Ala Leu Ser Asn Glu
65 70 75 80

TTG TGC CTT GCC GGG ATC GAA ACA AAA ACA GGA AAA TCA CAA GAT GAA 288
Leu Cys Leu Ala Gly Ile Glu Thr Lys Thr Gly Lys Ser Gln Asp Glu
85 90 95

GTG ATG GAT ATG TTG TCA ACT TAT CGT TTA ACT GGA GAG ACA CCT TAT 336
Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr
100 105 110

CAT CAC GCT TAT GAA ACT GTT CGT GAA ATC GTT CAT GAA CGT GAT CCA 384
His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro
115 120 125

GGA TTT CGT CAT TTG TCA CAG GCA CCC ATT GTT GCT GCT AAG CTC GAT 432
Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp
130 135 140

CCT GTG ACT TTG TTG GGT ATT AGC TCC CAT ATT TCG CCA GAA CTG TAT 480
Pro Val Thr Leu Leu Gly Ile Ser Ser His Ile Ser Pro Glu Leu Tyr
145 150 155 160

AAC TTG CTG ATT GAG GAG ATC CCG GAA AAA GAT GAA GCC GCG CTT GAT 528
Asn Leu Leu Ile Glu Glu Ile Pro Glu Lys Asp Glu Ala Ala Leu Asp
165 170 175

ACG CTT TAT AAA ACA AAC TTT GGC GAT ATT ACT ACT GCT CAG TTA ATG 576
Thr Leu Tyr Lys Thr Asn Phe Gly Asp Ile Thr Thr Ala Gln Leu Met
180 185 190

TCC CCA AGT TAT CTG GCC CGG TAT TAT GGC GTC TCA CCG GAA GAT ATT 624
Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp Ile
195 200 205

GCC TAC GTG ACG ACT TCA TTA TCA CAT GTT GGA TAT AGC AGT GAT ATT 672
Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp Ile
210 215 220

CTG GTT ATT CCG TTG GTC GAT GGT GTG GGT AAG ATG GAA GTA GTT CGT 720
Leu Val Ile Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arg
225 230 235 240

GTT ACC CGA ACA CCA TCG GAT AAT TAT ACC AGT CAG ACG AAT TAT ATT 768
Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gln Thr Asn Tyr Ile
245 250 255

GAG CTG TAT CCA CAG GGT GGC GAC AAT TAT TTG ATC AAA TAC AAT CTA 816
Glu Leu Tyr Pro Gln Gly Gly Asp Asn Tyr Leu Ile Lys Tyr Asn Leu
260 265 270

AGC AAT AGT TTT GGT TTG GAT GAT TTT TAT CTG CAA TAT AAA GAT GGT 864
Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gln Tyr Lys Asp Gly
275 280 285

TCC GCT GAT TGG ACT GAG ATT GCC CAT AAT CCC TAT CCT GAT ATG GTC 912
Ser Ala Asp Trp Thr Glu Ile Ala His Asn Pro Tyr Pro Asp Met Val
290 295 300

ATA AAT CAA AAG TAT GAA TCA CAG GCG ACA ATC AAA CGT ACT GAC TCT 960
Ile Asn Gln Lys Tyr Glu Ser Gln Ala Thr Ile Lys Arg Ser Asp Ser
305 310 315 320

GAC AAT ATA CTC AGT ATA GGG TTA CAA AGA TGG CAT AGC GGT AGT TAT 1008
Asp Asn Ile Leu Ser Ile Gly Leu Gln Arg Trp His Ser Gly Ser Tyr
325 330 335

AAT TTT GCC GCC GCC AAT TTT AAA ATT GAC CAA TAC TCC CCG AAA GCT 1056
Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gln Tyr Ser Pro Lys Ala
340 345 350

TTC CTG CTT AAA ATG AAT AAG GCT ATT CGG TTG CTC AAA GCT ACC GGC 1104
Phe Leu Leu Lys Met Asn Lys Ala Ile Arg Leu Leu Lys Ala Thr Gly
355 360 365

CTC TCT TTT GCT ACG TTG GAG CGT ATT GTT GAT AGT GTT AAT AGC ACC 1152
Leu Ser Phe Ala Thr Leu Glu Arg Ile Val Asp Ser Val Asn Ser Thr
370 375 380

AAA TCC ATC ACG GTT GAG GTA TTA AAC AAG GTT TAT CGG GTA AAA TTC 1200
Lys Ser Ile Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe
385 390 395 400

TAT ATT GAT CGT TAT GGC ATC AGT GAA GAG ACA GCC GCT ATT TTG GCT 1248
Tyr Ile Asp Arg Tyr Gly Ile Ser Glu Glu Thr Ala Ala Ile Leu Ala
405 410 415



AAT ATT AAT ATC TCT CAG CAA GCT GTT GGC AAT CAG CTT AGC CAG TTT 1296
Asn Ile Asn Ile Ser Gln Gln Ala Val Gly Asn Gln Leu Ser Gln Phe
420 425 430

GAG CAA CTA TTT AAT CAC CCG CCG CTC AAT GGT ATT CGC TAT GAA ATC 1344
Glu Gln Leu Phe Asn His Pro Pro Leu Asn Gly Ile Arg Tyr Glu Ile
435 440 445

AGT GAG GAC AAC TCC AAA CAT CTT CCT AAT CCT GAT CTG AAC CTT AAA 1392
Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys
450 455 460

CCA GAC AGT ACC GGT GAT GAT CAA CGC AAG GCG GTT TTA AAA CGC GCG 1440
Pro Asp Ser Thr Gly Asp Asp Gln Arg Lys Ala Val Leu Lys Arg Ala
465 470 475 480



TTT CAG GTT AAC GCC AGT GAG TTG TAT CAG ATG TTA TTG ATC ACT GAT 1488
Phe Gln Val Asn Ala Ser Glu Leu Tyr Gln Met Leu Leu Ile Thr Asp
485 490 495

CGT AAA GAA GAC GGT GTT ATC AAA AAT AAC TTA GAG AAT TTG TCT GAT 1536
Arg Lys Glu Asp Gly Val Ile Lys Asn Asn Leu Glu Asn Leu Ser Asp
500 505 510

CTG TAT TTG GTT AGT TTG CTG GCC CAG ATT CAT AAC CTG ACT ATT GCT 1584
Leu Tyr Leu Val Ser Leu Leu Ala Gln Ile His Asn Leu Thr Ile Ala
515 520 525

GAA TTG AAC ATT TTG TTG GTG ATT TCT GGC TAT GGC GAC ACC AAC ATT 1632
Glu Leu Asn Ile Leu Leu Val Ile Cys Gly Tyr Gly Asp Thr Asn Ile
530 535 540

TAT CAG ATT ACC GAC GAT AAT TTA GCC AAA ATA GTG GAA ACA TTG TTG 1680
Tyr Gln Ile Thr Asp Asp Asn Leu Ala Lys Ile Val Glu Thr Leu Leu
545 550 555 560

TGG ATC ACT CAA TGG TTG AAG ACC CAA AAA TGG ACA GTT ACC GAC CTG 1728
Trp Ile Thr Gln Trp Leu Lys Thr Gln Lys Trp Thr Val Thr Asp Leu
565 570 575

TTT CTG ATG ACC ACG GCC ACT TAC AGC ACC ACT TTA ACG CCA GAA ATT 1776
Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu Ile
580 585 590

AGC AAT CTG ACG GCT ACG TTG TCT TCA ACT TTG CAT GGC AAA GAG AGT 1824
Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser
595 600 605

5 

CTG ATT GGG GAA GAT CTG AAA AGA GCA ATG GCG CCT TGC TTC ACT TCG 1872
Leu Ile Gly Glu Asp Leu Lys Arg Ala Met Ala Pro Cys Phe Thr Ser
610 615 620

GCT TTG CAT TTG ACT TCT CAA GAA GTT GCG TAT GAC CTG CTG TTG TGG 1920
Ala Leu His Leu Thr Ser Gln Glu Val Ala Tyr Asp Leu Leu Leu Trp
625 630 635 640

ATA GAC CAG ATT CAA CCG GCA CAA ATA ACT GTT GAT GGG TTT TGG GAA 1968
Ile Asp Gln Ile Gln Pro Ala Gln Ile Thr Val Asp Gly Phe Trp Glu
645 650 655

GAA GTG CAA ACA ACA CCA ACC AGC TTG AAG GTG ATT ACC TTT GCT CAG 2016
Glu Val Gln Thr Thr Pro Thr Ser Leu Lys Val Ile Thr Phe Ala Gln
660 665 670

GTG CTG GCA CAA TTG AGC CTG ATC TAT CGT CGT ATT GGG TTA AGT GAA 2064
Val Leu Ala Gln Leu Ser Leu Ile Tyr Arg Arg Ile Gly Leu Ser Glu
675 680 685



ACG GAA CTG TCA CTG ATC GTG ACT CAA TCT TCT CTG CTA GTG GCA GGC 2112
Thr Glu Leu Ser Leu Ile Val Thr Gln Ser Ser Leu Leu Val Ala Gly
690 695 700

AAA AGC ATA CTG GAT CAC GGT CTG TTA ACC CTG ATG GCC TTG GAA GGT 2160
Lys Ser Ile Leu Asp His Gly Leu Leu Thr Leu Met Ala Leu Glu Gly
705 710 715 720

TTT CAT ACC TGG GTT AAT GGC TTG GGG CAA CAT GCC TCC TTG ATA TTG 2208
Phe His Thr Trp Val Asn Gly Leu Gly Gln His Ala Ser Leu Ile Leu
725 730 735

GCG GCG TTG AAA GAC GGA GCC TTG ACA GTT ACC GAT GTA GCA CAA GCT 2256
Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gln Ala
740 745 750

ATG AAT AAG GAG GAA TCT CTC CTA CAA ATG GCA GCT AAT CAG GTG GAG 2304
Met Asn Lys Glu Glu Ser Leu Leu Gln Met Ala Ala Asn Gln Val Glu
755 760 765

AAG GAT CTA ACA AAA CTG ACC AGT TGG ACA CAG ATT GAC GCT ATT CTG 2352
Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gln Ile Asp Ala Ile Leu
770 775 780



CAA TGG TTA CAG ATG TCT TCG GCC TTG GCG GTT TCT CCA CTG GAT CTG 2400
Gln Trp Leu Gln Met Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu
785 790 795 800

GCA GGG ATG ATG GCC CTG AAA TAT GGG ATA GAT CAT AAC TAT GCT GCC 2448
Ala Gly Met Met Ala Leu Lys Tyr Gly Ile Asp His Asn Tyr Ala Ala
805 810 815

TGG CAA GCT GCG GCG GCT GCG CTG ATG GCT GAT CAT GCT AAT CAG GCA 2496
Trp Gln Ala Ala Ala Ala Ala Leu Met Ala Asp His Ala Asn Gln Ala
820 825 830

CAG AAA AAA CTG GAT GAG ACG TTC AGT AAG GCA TTA TGT AAC TAT TAT 2544
Gln Lys Lys Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn Tyr Tyr
835 840 845

ATT AAT GCT GTT GTC GAT AGT GCT GCT GGA GTA CGT GAT CCT AAC GGT 2592
Ile Asn Ala Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly
850 855 860

TTA TAT ACC TAT TTG CTG ATT GAT AAT CAG GTT TCT GCC GAT GTG ATC 2640
Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Asp Val Ile
865 870 875 880

ACT TCA CGT ATT GCA GAA GCT ATC GCC GGT ATT CAA CTG TAC GTT AAC 2688
Thr Ser Arg Ile Ala Glu Ala Ile Ala Gly Ile Gln Leu Tyr Val Asn
885 890 895

CGG GCT TTA AAC CGA GAT GAA GGT CAG CTT GCA TCG GAC GTT AGT ACC 2736
Arg Ala Leu Asn Arg Asp Glu Gly Gln Leu Ala Ser Asp Val Ser Thr
900 905 910

CGT CAG TTC TTC ACT GAC TGG GAA CGT TAC AAT AAA CGT TAC AGT ACT 2784
Arg Gln Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr
915 920 925

TGG GCT GGT GTC TCT GAA CTG GTC TAT TAT CCA GAA AAC TAT GTT GAT 2832
Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr Val Asp
930 935 940

CCC ACT CAG CGC ATT GGG CAA ACC AAA ATC ATG GAT GCG CTG TTC CAA 2880
Pro Thr Gln Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln
945 950 955 960

TCC ATC AAC CAG AGC CAG CTA AAT GCG GAT ACG GTC GAA GAT GCT TTC 2928
Ser Ile Asn Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe
965 970 975

AAA ACT TAT TTG ACC AGC TTT GAG CAG GTA GCA AAT CTG AAA GTA ATT 2976
Lys Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile
980 985 990



AGT GCT TAC CAC GAT AAT GTC AAT GTG GAT CAA GCA TTA ACT TAT TTT 3024
Ser Ala Tyr His Asp Asn Val Asn Val Asp Gln Gly Leu Thr Tyr Phe
995 1000 1005

ATC GGT ATC GAC CAA GCA GCT CCG GGT ACG TAT TAC TGG CGT AGT GTT 3072
Ile Gly Ile Asp Gln Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val
1010 1015 1020

GAT CAC AGC AAA TCT GAA AAT GGC AAG TTT GCC GCT AAT GCT TGG GGT 3120
Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly
1025 1030 1035 1040

GAG TGG AAT AAA ATT ACC TCT GCT GTC AAT CCT TGG AAA AAT ATC ATC 3168
Glu Trp Asn Lys Ile Thr Cys Ala Val Asn Pro Trp Lys Asn Ile Ile
1045 1050 1055

CGT CCG GTT GTT TAT ATG TCC CGC TTA TAT CTG CTA TGG CTG GAG CAG 3216
Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln
1060 1065 1070

CAA TCA AAG AAA AGT GAT GAT GGT AAA ACC ACG ATT TAT CAA TAT AAC 3264
Gln Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ile Tyr Gln Tyr Asn
1075 1080 1085



TTA AAA CTG GCT CAT ATT CGT TAC GAC GGT AGT TGG AAT ACA CCA TTT 3312
Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe
1090 1095 1100

ACT TTT CAT GTG ACA GAA AAG GTA AAA AAT TAC ACG TCG AGT ACT GAT 3360
Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp
1105 1110 1115 1120

GCT GCT GAA TCT TTA GGG TTG TAT TGT ACT GGT TAT CAA GGG GAA GAC 3408
Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gln Gly Glu Asp
1125 1130 1135

ACT CTA TTA GTT ATG TTC TAT TCG ATG CAG AGT AGT TAT AGC TCC TAT 3456
Thr Leu Leu Val Met Phe Tyr Ser Met Gln Ser Ser Tyr Ser Ser Tyr
1140 1145 1150

ACC GAT AAT AAT GCG CCG GTC ACT GGG CTA TAT ATT TTC GCT GAT ATG 3504
Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr Ile Phe Ala Asp Met
1155 1160 1165

TCA TCA GAC AAT ATG ACG AAT GCA CAA GCA ACT AAC TAT TGG AAT AAC 3552
Ser Ser Asp Asn Met Thr Asn Ala Gln Ala Thr Asn Tyr Trp Asn Asn
1170 1175 1180

AGT TAT CCG CAA TTT GAT ACT GTG ATG GCA GAT CCG GAT AGC GAC AAT 3600
Ser Tyr Pro Gln Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn
1185 1190 1195 1200

AAA AAA GTC ATA ACC AGA AGA GTT AAT AAC CGT TAT GCG GAG GAT TAT 3648
Lys Lys Val Ile Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr
1205 1210 1215

GAA ATT CCT TCC TCT GTG ACA AGT AAC AGT AAT TAT TCT TGG GGT GAT 3696
Glu Ile Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp
1220 1225 1230

CAC AGT TTA ACC ATG CTT TAT GGT GGT AGT GTT CCT AAT ATT ACT TTT 3744
His Ser Leu Thr Met Leu Tyr Gly Gly Ser Val Pro Asn Ile Thr Phe
1235 1240 1245

GAA TCG GCG GCA GAA GAT TTA AGG CTA TCT ACC AAT ATG GCA TTG AGT 3792
Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser
1250 1255 1260

ATT ATT CAT AAT GGA TAT GCG GGA ACC CGC CGT ATA CAA TCT AAT CTT 3840
Ile Ile His Asn Gly Tyr Ala Gly Thr Arg Arg Ile Gln Cys Asn Leu
1265 1270 1275 1280

ATG AAA CAA TAC GCT TCA TTA GGT GAT AAA TTT ATA ATT TAT GAT TCA 3888
Met Lys Gln Tyr Ala Ser Leu Gly Asp Lys Phe Ile Ile Tyr Asp Ser
1285 1290 1295

TCA TTT GAT GAT GCA AAC CGT TTT AAT CTG GTG CCA TTG TTT AAA TTC 3936
Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe
1300 1305 1310

40 GGA AAA GAC GAG AAC TCA GAT GAT AGT ATT TGT ATA TAT AAT GAA AAC 3984 
Gly Lys Asp Glu Asn Ser Asp Asp Ser He Cys He Tyr Asn Glu Asn 
1315 1320 1325 

CCT TCC TCT GAA GAT AAG AAG TGG TAT TTT TCT TCG AAA GAT GAC AAT 4 03 2 
45 Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 
1330 1335 1340 

AAA ACA GCG GAT TAT AAT GGT GGA ACT CAA TGT ATA GAT GCT GGA ACC 4080 
Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys He Asp Ala Gly Thr 
50 1345 1350 1355 1360 

AGT AAC AAA GAT TTT TAT TAT AAT CTC CAG GAG ATT GAA GTA ATT AGT 4128 
Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu He Glu Val He Ser 
1365 1370 1375 

GTT ACT GGT GGG TAT TGG- TCG AGT TAT AAA ATA TCC AAC CCG ATT AAT 417 6 
Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys He Ser Asn Pro He Asn 
1380 1385 1390 

60 ATC AAT ACG GGC ATT GAT AGT GCT AAA GTA AAA GTC ACC GTA AAA GCG 422 4 
He Asn Thr Gly He Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 
1395 1400 1405 

GGT GGT GAC GAT CAA ATC TTT ACT GCT GAT AAT AGT ACC TAT GTT CCT 4 27 2 
Gly Gly Asp Asp Gin He Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 
1410 U15 1420 

CAG CAA CCG GCA CCC AGT TTT GAG GAG ATG ATT TAT CAG TTC AAT AAC 4 320 
Gin Gin Pro Ala Pro Ser Phe Glu Glu Mec He Tyr Gin Phe Asn Asn 
70 -H25 1430 1435 1440 
CTG ACA ATA CAT TGT AAG AAT TTA AAT TTC ATC GAC AAT CAG GCA CAT 4368 
Leu Thr Ile Asp Cys Lys Asn Leu Asn Phe Ile Asp Asn Gln Ala His 
1445 1450 1455 

ATT GAG ATT GAT TTC ACC GCT ACG GCA CAA GAT GGC CGA TTC TTG CGT 4416 
Ile Glu Ile Asp Phe Thr Ala Thr Ala Gln Asp Gly Arg Phe Leu Gly 
1460 1465 1470 

GCA GAA ACT TTT ATT ATC CCG GTA ACT AAA AAA GTT CTC GGT ACT GAG 4464 
Ala Glu Thr Phe Ile Ile Pro Val Thr Lys Lys Val Leu Gly Thr Glu 
1475 1480 1485 

AAC GTG ATT GCG TTA TAT AGC GAA AAT AAC GGT GTT CAA TAT ATG CAA 4512 
Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly Val Gln Tyr Met Gln 
1490 1495 1500 

ATT GGC GCA TAT CGT ACC CGT TTG AAT ACG TTA TTC GCT CAA CAG TTG 4560 
Ile Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gln Gln Leu 
1505 1510 1515 1520 






GTT AGC CGT GCT AAT CGT GGC ATT GAT GCA GTG CTC AGT ATG GAA ACT 4608 
Val Ser Arg Ala Asn Arg Gly Ile Asp Ala Val Leu Ser Met Glu Thr 
1525 1530 1535 

CAG AAT ATT CAG GAA CCG CAA TTA GGA GCG GGC ACA TAT GTG CAG CTT 4656 
Gln Asn Ile Gln Glu Pro Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu 
1540 1545 1550 

GTG TTG GAT AAA TAT GAT GAG TCT ATT CAT GGC ACT AAT AAA AGC TTT 4704 
Val Leu Asp Lys Tyr Asp Glu Ser Ile His Gly Thr Asn Lys Ser Phe 
1555 1560 1565 

GCT ATT GAA TAT GTT GAT ATA TTT AAA GAG AAC GAT AGT TTT GTG ATT 4752 
Ala Ile Glu Tyr Val Asp Ile Phe Lys Glu Asn Asp Ser Phe Val Ile 
1570 1575 1580 

TAT CAA GGA GAA CTT AGC GAA ACA AGT CAA ACT GTT GTG AAA GTT TTC 4800 
Tyr Gln Gly Glu Leu Ser Glu Thr Ser Gln Thr Val Val Lys Val Phe 
1585 1590 1595 1600 

TTA TCC TAT TTT ATA GAG GCG ACT GGA AAT AAG AAC CAC TTA TGG GTA 4848 
Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val 
1605 1610 1615 

CGT GCT AAA TAC CAA AAG GAA ACG ACT GAT AAG ATC TTG TTC GAC CGT 4896 
Arg Ala Lys Tyr Gln Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg 
1620 1625 1630 

ACT GAT GAG AAA GAT CCG CAC GGT TGG TTT CTC AGC GAC GAT CAC AAG 4944 
Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 
1635 1640 1645 

ACC TTT AGT GGT CTC TCT TCC GCA CAG GCA TTA AAG AAC GAC AGT GAA 4992 
Thr Phe Ser Gly Leu Ser Ser Ala Gln Ala Leu Lys Asn Asp Ser Glu 
1650 1655 1660 

CCG ATG GAT TTC TCT GGC GCC AAT GCT CTC TAT TTC TGG GAA CTG TTC 5040 
Pro Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 
1665 1670 1675 1680 

TAT TAC ACG CCG ATG ATG ATG GCT CAT CGT TTG TTG CAG GAA CAG AAT 5088 
Tyr Tyr Thr Pro Met Met Met Ala His Arg Leu Leu Gln Glu Gln Asn 
1685 1690 1695 

TTT GAT GCG GCG AAC CAT TGG TTC CGT TAT GTC TGG AGT CCA TCC GGT 5136 
Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 
1700 1705 1710 

TAT ATC GTT GAT GGT AAA ATT GCT ATC TAC CAC TGG AAC GTG CGA CCG 5184 
Tyr Ile Val Asp Gly Lys Ile Ala Ile Tyr His Trp Asn Val Arg Pro 
1715 1720 1725 



CTG GAA GAA GAC ACC AGT TGG AAT GCA CAA CAA CTG GAC TCC ACC GAT 5232 
Leu Glu Glu Asp Thr Ser Trp Asn Ala Gln Gln Leu Asp Ser Thr Asp 
1730 1735 1740 

CC^ GAT GCT GTA GCC CAA GAT GAT CCG ATG CAC TAC AAG GTG GCT ACC 5280 
Pro Asp Ala Val Ala Gln Asp Asp Pro Met His Tyr Lys Val Ala Thr 
1745 1750 1755 1760 

TTT ATG GCG ACG TTG GAT CTG CTA ATG GCC CGT GGT GAT GCT GCT TAC 5328 
Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 
1765 1770 1775 

CGC CAG TTA GAG CGT GAT ACG TTG GCT GAA GCT AAA ATG TGG TAT ACA 5376 
Arg Gln Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 
1780 1785 1790 

CAG GCG CTT AAT CTG TTG GGT GAT GAG CCA CAA GTG ATG CTG AGT ACG 5424 
Gln Ala Leu Asn Leu Leu Gly Asp Glu Pro Gln Val Met Leu Ser Thr 
1795 1800 1805 

ACT TGG GCT AAT CCA ACA TTG GGT AAT GCT GCT TCA AAA ACC ACA CAG 5472 
Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gln 
1810 1815 1820 

CAG CTT CGT CAG CAA GTG CTT ACC CAG TTG CGT CTC AAT AGC AGG GTA 5520 
Gln Val Arg Gln Gln Val Leu Thr Gln Leu Arg Leu Asn Ser Arg Val 
1825 1830 1835 1840 

AAA ACC CCG TTG 5532 
Lys Thr Pro Leu 
1844 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1844 amino acids 

(B) TYPE: amino acids 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53 (TcbAii): 

Features    From    To      Description 
Peptide     1       1844    TcbAii peptide 
Fragment    1       11      (SEQ ID NO: 1) 
Fragment    978     990     (SEQ ID NO: 23) 
Fragment    1387    1401    (SEQ ID NO: 22) 
Fragment    1484    1505    (SEQ ID NO: 24) 
Fragment    1527    1552    (SEQ ID NO: 21) 



Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr 
1 5 10 15 

Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu 
20 25 30 

Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser Ile 
35 40 45 

Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser 
50 55 60 

Gln Lys Asn Met Asp Glu Glu Ile Ser Thr Leu Ala Leu Ser Asn Glu 
65 70 75 80 

Leu Cys Leu Ala Gly Ile Glu Thr Lys Thr Gly Lys Ser Gln Asp Glu 
85 90 95 

Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr 
100 105 110 

His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro 
115 120 125 

Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp 
130 135 140 

Pro Val Thr Leu Leu Gly Ile Ser Ser His Ile Ser Pro Glu Leu Tt 
145 150 155 160 

Asn Leu Leu Ile Glu Glu Ile Pro Glu Lys Asp Glu Ala Ala Leu Asp 
165 170 175 






Thr Leu Tyr Lys Thr Asn Phe Gly Asp He Thr Thr Ala Gin Leu Met 
180 185 190 

Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp He 
195 200 205 



Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp He 
25 210 215 220 

Leu Val He Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arq 
225 230 235 240 

30 Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gin Thr Asn Tyr He 

245 250 255 



Glu Leu Tyr Pro Gin Gly Gly Asp Asn Tyr Leu He Lys Tyr Asn Leu 
260 265 270 

Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gin Tyr Lys Asp Gly 
275 280 285 



Ser Ala Asp Trp Thr Glu He Ala His Asn Pro Tyr Pro Asp Met Val 
40 290 295 300 

He Asn Gin Lys Tyr Glu Ser Gin Ala Thr He Lys Arg Ser Asp Ser 
305 310 315 320 

45 Asp Asn lie Leu Ser He Gly Leu Gin Arg Trp His Ser Gly Ser Tyr 

325 330 335 



Asn Phe Ala Ala Ala Asn Phe Lys lie Asp Gin Tyr Ser Pro Lys Ala 
340 345 350 

Phe Leu Leu Lys Met Asn Lys Ala He Arg Leu Leu Lys Ala Thr Gly 
355 360 365 



Leu Ser Phe Ala Thr Leu Glu Arg He Val Asp Ser Val Asn Ser Thr 

55 370 375 330 

Lys Ser He Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 

385 390 395 400 

60 Tyr He Asp Arg Tyr Gly He Ser Glu Glu Thr Ala Ala He Leu Ala 

405 410 415 



Asn He Asn He Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 
420 425 430 

Glu Gin Leu Phe Asn His Pro Pro Leu Asn Gly He Arg Tyr Glu He 
435 440 445 



Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys 
70 450 455 460 
Pro Asp Ser Thr Gly Asp Asp Gln Arg Lys Ala Val Leu Lys Arg Ala 
465 470 475 480 

Phe Gln Val Asn Ala Ser Glu Leu Tyr Gln Met Leu Leu Ile Thr Asp 
485 490 495 

Arg Lys Glu Asp Gly Val Ile Lys Asn Asn Leu Glu Asn Leu Ser Asp 
500 505 510 

Leu Tyr Leu Val Ser Leu Leu Ala Gln Ile His Asn Leu Thr Ile Ala 
515 520 525 






Glu Leu Asn He Leu Leu Val He Cys Gly Tyr Gly Asp Thr Asn lie 

530 535 540 

Tyr Gin He Thr Asp Asp Asn Leu Ala Lys He Val Glu Thr Leu Leu 

545 550 555 560 



Trp He Thr Gin Trp Leu Lys Thr Gin Lys Trp Thr Val Thr Asp Leu 

20 565 570 575 

Phe Leu Met Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu He 

580 585 590 

25 Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser 

595 600 605 



Leu He Gly Glu Asp Leu Lys Arg Ala Met Ala Pro Cys Phe Thr Ser 

610 615 620 

Ala Leu His Leu Thr ser Gin Glu Val Ala Tyr Asp Leu Leu Leu Trp 

625 630 635 640 



He Asp Gin He Gin Pro Ala Gin He Thr Val Asp Gly Phe Trp Glu 
35 645 650 655 

Glu Val Gin Thr Thr Pro Thr Ser Leu Lys Val He Thr Phe Ala Gin 
660 665 670 

40 Val Leu Ala Gin Leu Ser Leu He Tyr Arg Arg He Gly Leu Ser Glu 
675 680 685 



Thr Glu Leu Ser Leu He Val Thr Gin Ser Ser Leu Leu Val Ala Gly 

690 695 700 

Lys Ser He Leu Asp His Gly Leu Leu Thr Leu Met Ala Leu Glu Gly 

705 710 715 720 



Phe His Thr Trp Val Asn Gly Leu Gly Gin His Ala Ser Leu He Leu 
50 725 730 735 

Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gin Ala 
740 745 "50 

55 Met Asn Lys Glu Glu Ser Leu Leu Gin Met Ala Ala Asn Gin Val Glu 

755 760 765 



Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gin He Asp Ala He Leu 

770 775 780 

Gin Trp Leu Gin Met Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu 

735 790 795 800 



Ala Gly Met Met Ala Leu Lys Tyr Gly He Asp His Asn Tyr A.la Ala 
65 305 810 315 

Trp Gin Ala Ala Ala Ala Ala Leu Met Ala Asp His Ala Asn Gin Ala 

820 825 330 

70 Gin Lys Lys Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn Tyr Tyr 
335 840 845 
Ile Asn Ala Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly 
850 855 860 

Leu Tyr Thr Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Asp Val Ile 
865 870 875 880 

Thr Ser Arg Ile Ala Glu Ala Ile Ala Gly Ile Gln Leu Tyr Val Asn 
885 890 895 

Arg Ala Leu Asn Arg Asp Glu Gly Gin Leu Ala Ser Asp Val ser Thr 
900 905 910 

Arg Gin Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr 
13 915 920 925 

Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr Val Asd 
930 935 940 

20 Pro Thr Gin Arg He Gly Gin Thr Lys Mec Met Asp Ala Leu Leu Gin 
945 950 955 9 60 

Ser He Asn Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe 
^ 965 970 975 

Lys Thr Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He 
980 985 990 

Ser Ala Tyr His Asp Asn Val Asn Val Asp Gin Gly Leu Thr Tyr Phe 
30 995 iOOO 1005 

He Gly He Asp Gin Aia Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val 
1010 1015 1020 

35 Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 
1025 1030 1035 1040 

Glu Trp Asn Lys He Thr Cys Ala Val Asn Pro Trp Lys Asn He He 
1045 1050 1055 

Arg Pro Val Val Tyr Met Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin 
1060 1065 1070 

Gin Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr He Tyr Gin Tyr Asn 
•+5 1075 1080 1085 

Leu Lys Leu Ala His He Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 
1090 1095 1100 

50 Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 
HO 5 1110 1115 1120 

Ala Ala Glu Ser Leu Gly Leu Tyr Cys Thr Gly Tyr Gin Gly Glu Asp 
55 H25 U30 1135 

Thr Leu Leu Val Met Phe Tyr Ser Met Gin Ser Ser Tyr Ser Ser Tyr 
1140 H45 1150 

Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr He Phe Ala Asp Met 
n " 1155 H60 1165 

Ser ser Asp Asn Met Thr Asn Ala Gin Ala Thr Asn Tyr Trp Asn Asn 
1170 H75 1130 

65 Ser Tyr Pro Gin Phe Asp Thr Val Met Ala Asp Pro Asp Ser Asp Asn 
1135 H90 1195 1200 

Lys Lys Val He Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr 
1205 1210 1215 






Glu Ile Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly Asp 
1220 1225 1230 

His Ser Leu Thr Met Leu Tyr Gly Gly Ser Val Pro Asn Ile Thr Phe 
1235 1240 1245 


Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Met Ala Leu Ser 
1250 1255 1260 

He He His Asn Gly Tyr Ala Gly Thr Arg Arg He Gin Cys Asn Leu 
10 1265 1270 1275 1230 

Met Lys Gin Tyr Ala Ser Leu Gly Asp Lys Phe He He Tyr Asp Ser 
1285 1290 1295 

15 Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 
1300 1305 1310 






Gly Lys Asp Glu Asn Ser Asp Asp Ser He Cys He Tyr Asn Glu Asn 
1315 1320 1325 

Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 
1330 1335 1340 



Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cvs He Asp Ala Gly Thr 
25 1345 1350 1355 1360 

Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu He Glu Val He Ser 
1365 1370 1375 

30 Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys He Ser Asn Pro He Asn 
1380 1385 1390 



He Asn Thr Gly He Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 
1395 1400 1405 

Gly Gly Asp Asp Gin He Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 

1410 1415 1420 



Gin Gin Pro Ala Pro Ser Phe Glu Glu Met He Tyr Gin Phe Asn Asn 
40 1425 1430 1435 1440 

Leu Thr He Asp Cys Lys Asn Leu Asn Phe He Asp Asn Gin Ala His 
1445 1450 1455 

45 He Glu He Asp Phe Thr Ala Thr Ala Gin Asp Gly Arg Phe Leu Gly 
1460 1465 1470 



Ala Glu Thr Phe He He Pro Val Thr Lys Lys Val Leu Gly Thr Glu 
1475 1480 1485 

Asn Val He Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Met Gin 
1490 1495 1500 



He Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 
55 1505 1510 1515 1520 

Val Ser Arg Ala Asn Arg Gly He Asp Ala Val Leu Ser Met Glu Thr 
1525 1530 1535 

60 Gin Asn He Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 
1540 1545 1550 

Val Leu Asp Lys Tyr Asp Glu Ser He His Gly Thr Asn Lys Ser Phe 
1555 1560 1565 


Ala He Glu Tyr Val Asp He Phe Lys Glu Asn Asp Ser Phe Val He 
1570 1575 1580 

Tyr Gin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 
70 1585 1590 1595 1600 
Leu Ser Tyr Phe Ile Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val 
1605 1610 1615 

Arg Ala Lys Tyr Gln Lys Glu Thr Thr Asp Lys Ile Leu Phe Asp Arg 
1620 1625 1630 

Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 
1635 1640 1645 

Thr Phe Ser Gly Leu Ser Ser Ala Gln Ala Leu Lys Asn Asp Ser Glu 
1650 1655 1660 






Pro Mec Asp Phe Ser Gly Ala Asn Ala Leu Tyr phe Trp Glu Leu Phe 
1665 1670 1675 1680 

Tyr Tyr Thr Pro Mec Mec Mec Ala His Arg Leu Leu Gin Glu Gin Asn 
1685 1690 1695 



Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp ser Pro Ser Gly 
20 1*700 1705 1710 

Tyr He Val Asp Gly Lys He Ala lie Tyr His Trp Asn Val Arg Pro 
1715 1720 1725 

25 Leu Glu Glu Asp Thr Ser Trp Asn Ala Gin Gin Leu Asp Ser Thr Asp 
1730 1735 1740 






Pro Asp Ala Val Ala Gin Asp Asp Pro Met His Tyr Lys Val Ala Thr 
1745 1750 1755 1760 

Phe Mec Ala Thr Leu Asp Leu Leu Mec Ala Arg Gly Asp Ala Ala Tyr 
1765 1770 1775 



Arg Gin Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Mec Trp Tyr Thr 
35 1780 1785 1790 

Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Mec Leu Ser Thr 
1795 1800 1805 

40 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 
1810 1815 1820 






Gin Val Arg Gin Gin Val Leu Thr Gin Leu Arg Leu Asn Ser Arg Val 
1825 1830 1835 1840 

Lys Thr Pro Leu 
1844 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1722 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54 (TcbAiii coding region): 

CTA GGA ACA GCC AAT TCC CTG ACC GCT TTA TTC CTG CCG CAG GAA AAT 48 
Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn 
1 5 10 15 






AGC AAG CTC AAA GGC TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT 96 
Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn 
20 25 30 

TTA CGT CAT AAT CTG TCC ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG 144 
Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu 
35 40 45 

TAT GCT AAA CCG GCT GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA 192 
Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser 
50 55 60 

CCT TCT CAA GGG GGA GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC 240 
Ala Ser Gln Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His 
65 70 75 80 

CGC TTC CCT CAA ATG CTA GAA GGG GCA CGG GGC TTG GTT AAC CAG CTT 288 
Arg Phe Pro Gln Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu 
85 90 95 






ATA CAG TTC GGT AGT TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG 336 
Ile Gln Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala 
100 105 110 

GAA GCT ATG AGT CAA CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG 384 
Glu Ala Met Ser Gln Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu 
115 120 125 

ACC AGT ATT CGT ATG CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA 432 
Thr Ser Ile Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu 
130 135 140 

AAA ACC GCC TTG CAA GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC 480 
Lys Thr Ala Leu Gln Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp 
145 150 155 160 

AGC TAT AGC CAA CTG TAT GAG GAG AAC ATC AAC GCA GGT GAG CAG CGA 528 
Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg 
165 170 175 

GCG CTG GCG TTA CGC TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG 576 
Ala Leu Ala Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln 
180 185 190 

ATT TCC CGT ATG GCA GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC 624 
Ile Ser Arg Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe 
195 200 205 

GGC CTG GCT GAT GGC GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC 672 
Gly Leu Ala Asp Gly Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile 
210 215 220 

GCT GAC GGT ATT GAG TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG 720 
Ala Asp Gly Ile Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu 
225 230 235 240 

AAA GTT GCT CAG TCG GAA ATA TAT CGC CGT CGC CGT CAA GAA TCG AAA 768 
Lys Val Ala Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys 
245 250 255 

ATT CAG CGT GAC AAC GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA 816 
Ile Gln Arg Asp Asn Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln 
260 265 270 

CTG GAA TCA CTG TCT ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG 864 
Leu Glu Ser Leu Ser Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu 
275 280 285 

TAC CTG AAA ACC CAG CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA 912 
Tyr Leu Lys Thr Gln Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu 
290 295 300 

AGA AGC AAA TTC AGT AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT 960 
Arg Ser Lys Phe Ser Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg 
305 310 315 320 

TTC TCA GGT ATT TAT TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC 1008 
Leu Ser Gly Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys 
325 330 335 

CTG ATG GCA GAG CAA TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT 1056 
Leu Met Ala Glu Gln Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile 
340 345 350 

AGC TTT GTC AAA CCG GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG 1104 
Ser Phe Val Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu 
355 360 365 

TGT GGA GAA GCT TTG ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT 1152 
Cys Gly Glu Ala Leu Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr 
370 375 380 

CTG AAA TGG GAA TCT CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG 1200 
Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 
385 390 395 400 

GCA GTG GTT TAT GAT TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG 1248 
Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala 
405 410 415 

GAA CAA ATA CCT GCA TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT 1296 
Glu Gln Ile Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr 
420 425 430 

AAA GAA AAT GGG TTA TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC 1344 
Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val 
435 440 445 

AAA TTG TCC GAC TTG AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT 1392 
Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val 
450 455 460 

GGT AGC AAC AAG GTT CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT 1440 
Gly Ser Asn Lys Val Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro 
465 470 475 480 

GCA TTG GTT GGG CCT TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT 1488 
Ala Leu Val Gly Pro Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly 
485 490 495 

GGC ACT ACT CAA TTG CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT 1536 
Gly Ser Thr Gln Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His 
500 505 510 

GGT ACC AAT GAT AGT GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA 1584 
Gly Thr Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys 
515 520 525 

TAC CTG CCA TTT GAA GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT 1632 
Tyr Leu Pro Phe Glu Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn 
530 535 540 

CTT CAA TTT CCG AAT GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT 1680 
Leu Gln Phe Pro Asn Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr 
545 550 555 560 

ATG AGC GAT ATT ATT TTG CAT ATT CGT TAT ACC ATC CGT TAA 1722 
Met Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg *** 
565 570 573 
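
The pairing of each codon row with its amino acid row in the listing above can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the helper name `translate` and the table names are illustrative and are not part of the disclosure. It translates the final 14 codons of the SEQ ID NO:54 coding region (ending at position 1722) and confirms that 573 codons plus one stop codon account for the stated length of 1722 base pairs.

```python
# Standard genetic code, laid out in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna: str) -> list[str]:
    """Translate a whitespace-separated codon row into three-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


# Final codon row of SEQ ID NO:54 (residues 561-573 followed by the stop codon).
last_row = "ATG AGC GAT ATT ATT TTG CAT ATT CGT TAT ACC ATC CGT TAA"
print(" ".join(translate(last_row)))

# 573 sense codons plus one stop codon should equal the stated 1722 base pairs.
print(3 * 573 + 3 == 1722)
```

A row whose translation disagrees with the printed residue line is a candidate transcription error in either the codon or the residue, and would need to be resolved against the original filing rather than by inference.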



(2) INFORMATION FOR SEQ ID NO: 55: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 amino acids 
(B) TYPE: amino acids 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55 (TcbAiii): 

Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Glu Asn 
1 5 10 15 

Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gln Arg Met Phe Asn 
20 25 30 

Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Pro Leu 
35 40 45 






Tyr Ala Lys Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser 
50 55 60 

Ala Ser Gln Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr Ile His 
65 70 75 80 

Arg Phe Pro Gln Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gln Leu 
85 90 95 

Ile Gln Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gln Asp Ala 
100 105 110 

Glu Ala Met Ser Gln Leu Leu Gln Thr Gln Ala Ser Glu Leu Ile Leu 
115 120 125 

Thr Ser Ile Arg Met Gln Asp Asn Gln Leu Ala Glu Leu Asp Ser Glu 
130 135 140 

Lys Thr Ala Leu Gln Val Ser Leu Ala Gly Val Gln Gln Arg Phe Asp 
145 150 155 160 

Ser Tyr Ser Gln Leu Tyr Glu Glu Asn Ile Asn Ala Gly Glu Gln Arg 
165 170 175 

Ala Leu Ala Leu Arg Ser Glu Ser Ala Ile Glu Ser Gln Gly Ala Gln 
180 185 190 

Ile Ser Arg Met Ala Gly Ala Gly Val Asp Met Ala Pro Asn Ile Phe 
195 200 205 

Gly Leu Ala Asp Gly Gly Met His Tyr Gly Ala Ile Ala Tyr Ala Ile 
210 215 220 

Ala Asp Gly Ile Glu Leu Ser Ala Ser Ala Lys Met Val Asp Ala Glu 
225 230 235 240 

Lys Val Ala Gln Ser Glu Ile Tyr Arg Arg Arg Arg Gln Glu Trp Lys 
245 250 255 

Ile Gln Arg Asp Asn Ala Gln Ala Glu Ile Asn Gln Leu Asn Ala Gln 
260 265 270 

Leu Glu Ser Leu Ser Ile Arg Arg Glu Ala Ala Glu Met Gln Lys Glu 
275 280 285 

Tyr Leu Lys Thr Gln Gln Ala Gln Ala Gln Ala Gln Leu Thr Phe Leu 
290 295 300 

Arg Ser Lys Phe Ser Asn Gln Ala Leu Tyr Ser Trp Leu Arg Gly Arg 
305 310 315 320 

Leu Ser Gly Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys 
325 330 335 

Leu Met Ala Glu Gln Ser Tyr Gln Trp Glu Ala Asn Asp Asn Ser Ile 
340 345 350 

Ser Phe Val Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu 
355 360 365 

Cys Gly Glu Ala Leu Ile Gln Asn Leu Ala Gln Met Glu Glu Ala Tyr 
370 375 380 

Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 
385 390 395 400 

Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala 
405 410 415 

Glu Gln Ile Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr 
420 425 430 

Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val 
435 440 445 

Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val 
450 455 460 

Gly Ser Asn Lys Val Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro 
465 470 475 480 

Ala Leu Val Gly Pro Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly 
485 490 495 

Gly Ser Thr Gln Leu Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His 
500 505 510 

Gly Thr Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys 
515 520 525 

Tyr Leu Pro Phe Glu Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn 
530 535 540 

Leu Gln Phe Pro Asn Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr 
545 550 555 560 

Met Ser Asp Ile Ile Leu His Ile Arg Tyr Thr Ile Arg 
565 570 573 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2898 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56 (cccA) 
1 ATG AAT CAA CTC GCC AGT CCC CTG ATT TCC CGC ACC GAA GAG ATC CAC 48 
1 Met Asn Gln Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16 

49 AAC TTA CCC GGT AAA TTG ACC GAT CTT GGT TAT ACC TCA GTG TTT GAT 96 
17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32 

97 GTG GTA CGT ATG CCG CGT GAG CGT TTT ATT CGT GAG CAT CGT GCT GAT 144 
33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 48 

145 CTC GGG CGC AGT GCT GAA AAA ATG TAT GAC CTG GCA GTG GGC TAT GCT 192 
49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64 






193 CAT CAG GTG TTA CAC CAT TTT CGC CGT AAT TCT CTT AGT GAA GCT GTT 240 
65 His Gln Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 80 

241 CAG TTT GGC TTG AGA AGT CCG TTC TCC GTA TCA GGC CCG GAT TAC GCC 288 
81 Gln Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96 

289 AAT CAG TTT CTT GAT GCA AAC ACG GGT TGG AAA GAT AAA GCA CCA AGT 336 
97 Asn Gln Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 112 

337 GGA TCA CCG GAA GCC AAT GAT GCG CCG GTA GCC TAT CTG ACT CAT ATT 384 
113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128 

385 TAT CAA TTG GCC CTT GAA CAG GAA AAG AAT GGC GCC ACT ACC ATT ATG 432 
129 Tyr Gln Leu Ala Leu Glu Gln Glu Lys Asn Gly Ala Thr Thr Ile Met 144 

433 AAT ACG CTG GCG GAG CGT CGC CCC GAT CTG GGT GCT TTG TTA ATT AAT 480 
145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu Ile Asn 160 

481 GAT AAA GCA ATC AAT GAG GTG ATA CCG CAA TTG CAG TTG GTC AAT GAA 528 
161 Asp Lys Ala Ile Asn Glu Val Ile Pro Gln Leu Gln Leu Val Asn Glu 176 

529 ATT CTG TCC AAA GCT ATT CAG AAG AAA CTG AGT TTG ACT GAT CTG GAA 576 
177 Ile Leu Ser Lys Ala Ile Gln Lys Lys Leu Ser Leu Thr Asp Leu Glu 192 

577 GCG GTA AAC GCC AGA CTT TCC ACT ACC CGT TAC CCG AAT AAT CTG CCG 624 
193 Ala Val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208 

625 TAT CAT TAT GGT CAT CAG CAG ATT CAG ACA GCT CAA TCG GTA TTG GGT 672 
209 Tyr His Tyr Gly His Gln Gln Ile Gln Thr Ala Gln Ser Val Leu Gly 224 

673 ACT ACG TTG CAA GAT ATC ACT TTG CCA CAG ACG CTG GAT CTG CCG CAA 720 
225 Thr Thr Leu Gln Asp Ile Thr Leu Pro Gln Thr Leu Asp Leu Pro Gln 240 

721 AAC TTC TGG GCA ACA GCA AAA GGA AAA CTG AGC GAT ACG ACT GCC AGT 768 
241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256 

769 GCT TTG ACC CGA CTG CAA ATC ATG GCG AGT CAG TTT TCG CCA GAG CAG 816 
257 Ala Leu Thr Arg Leu Gln Ile Met Ala Ser Gln Phe Ser Pro Glu Gln 272 

817 CAG AAA ATC ATT ACG GAG ACT GTC GGT CAG GAT TTC TAT CAG CTT AAC 864 
273 Gln Lys Ile Ile Thr Glu Thr Val Gly Gln Asp Phe Tyr Gln Leu Asn 288 

865 TAT GGT GAC AGT TCG CTT ACT GTG AAT AGT TTC AGC GAC ATG ACC ATA 912 
289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr Ile 304 



9 13 ATG ACT GAT CGA AC A ACT TTG ACT GTA CCC CAG GTA GAA CTG AT Z TTj 

305 Mec Thr Asp Arg Thr Ssr Leu Thr Val Pro Gin Val Clu L^u Mec icu 

9£1 TGT TCA ACT GTC GGA GGT TCT ACG GTT GTT AAG TCT GAT AAT CTG ACT 

321 Cys Ser Thr Val Gly Gly Ser Thr Val Val Lys Ser Asp Asn Val Ser 



0 1009 TCT GGT GAC ACG ACA GCG ACG CCA TTT GCG TAT GGC GCC CGC TTT ATT 

337 3er Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe lie 

105*7 CAT GCC GGT AAG CCG GAG GCG ATT ACC CTG ACT CGC AGT GGT GCG GAG 

5 353 His Ala Gly Lys Pro Glu Ala lie Thr Leu Ser Arg Ser Gly Ala clu 



1105 GCG CAT TTT GCT CTG ACG GTT AAC AAT CTG ACA GAT GAC AAG TTG GAC 

359 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Leu Asp 

115 3 CGT ATT AAC CGC ACA GTG CGC CTG CAA AAA TGG CTG AAT CTG CCT TAT 

335 Arg lie Asn Arg Thr Val Arg Leu Gin Lys Trp Leu Asn Leu Pro Tyr 

12 01 GAG GAT ATT GAC CTG TTA GTG ACT TCT GCT ATG GAT GCG GAA ACA GGA 

401 Glu Asp lie Asp Leu Leu Val Thr Ser Ala Mec Asp Ala Glu Thr Gly 



0 1249 AAT ACC GCG CTG TCG ATG AAC GAC AAT ACG CTG CGT ATG TTG GGA GTG 

417 Asn Thr Ala Leu Ser Mec Asn Asp Asn Thr Leu Arg Mec Leu Gly Val 

1297 TTC AAA CAT TAT CAG GCG AAG TAT GGT GTT AGC GCT AAA CAA TTT GCT 

5 433 Phe Lys His Tyr Gin Ala Lys Tyr Gly Val Ser Ala Lys Gin Phe Ala 



13 45 GGC TGG CTG CGC GTA GTG GCC CCG TTT GCC ATT ACA CCG GCA ACG CCG 

449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala lie Thr Pro Ala Thr Pro 

1393 TTT TTA GAC CAA GTG TTT AAC TCC GTC GGC ACC TTT GAT ACA CCG TTT 

465 Phe Leu Asp Gin Val Phe Asn Ser Val Giy Thr Phe Asp Thr Pro Phe 

1441 GTG ATA GAT AAT CAG GAT TTT GTC TAT ACA TTG ACC ACC GGG GGC GAT 

481 Val lie Asp Asn Gin Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 



0 1439 GGG GCG CGT GTT AAG CAT ATC AGC ACG GCA CTG GGC CTC AAT CAT CGT 

497 Gly Ala Arg Val Lys His lie Ser Thr Ala Leu Gly Leu Asn His Arg 

153 " CAG TTC CTG TTA TTG GCG GAT AAT ATT GCC CGT CAA CAG GGG AAT GTC 

5 513 Gin Phe Leu Leu Leu Ala Asp Asn He Ala Arg Gin Gin Gly Asn Val 



1535 ACG CAA AGC ACA CTC AAC TGT AAT CTG TTT GTG GTG TCA GCT TTC TAC 
529 Thr Gin Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 

163 3 CGT CTG GCT AAT TTG GCG CGC ACA TTG GGG ATA AAT CCA GAG TCT TTC 
545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly He Asn Pro Glu Ser Ph= 

1681 TGT GCC TTG GTT GAT CGA TTA GAT GCA GGT ACA GGC ATC GTC TGG CAG 1728 
561 Cys Ala Leu Val Asp Arg Leu Asp Ala Gly Thr Gly Ile Val Trp Gln 576 

17 29 CAA TTG GCA GGG AAA CCC ACA ATC ACG GTA CCA CAA AAA GAT TCC CCG 

5 577 Gin Leu Ala Gly Lys Pro Thr He Thr Val Pro Gin Lys Asp Ser Pro 'ziZ 

177? CTG GCG GCG GAT ATT CTG AGT TTG CTG CAA GCG CTA AGT GCG ATT GCT 
1324 

0 593 Leu Ala Ala Asp lie Leu Ser Leu Leu Gin Ala Leu Ser Ala He Ala 
608 

1825 CAA TGG CAA CAA CAG CAC GAT TTA GAA TTT TCA GCA CTG CTT TTG CTG 137 2 

5 609 Gin Trp Gin Gin Gin His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 624 



1873 TTG AGT GAC AAC CCT ATT TCT ACC TCG CAG GGC ACT GAC GAT CAA TTG 1920
625 Leu Ser Asp Asn Pro Ile Ser Thr Ser Gln Gly Thr Asp Asp Gln Leu 640

1921 AAC TTT ATC CGT CAA GTG TGG CAG AAC CTA GGC AGT ACG TTT CTG GGT 1968
641 Asn Phe Ile Arg Gln Val Trp Gln Asn Leu Gly Ser Thr Phe Val Gly 656

1969 GCA ACA TTG TTG TCC CGC AGT GGG GCA CCA TTA GTC GAT ACC AAC GGC 2016
657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 672

2017 CAC GCT ATT GAC TGG TTT GCT CTG CTC TCA GCA GGT AAT AGT CCG CTT 2064
673 His Ala Ile Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688

2065 ATC GAT AAG GTT GGT CTG GTG ACT GAT GCT GGC ATA CAA AGT GTT ATA 2112
689 Ile Asp Lys Val Gly Leu Val Thr Asp Ala Gly Ile Gln Ser Val Ile 704

2113 GCA ACG GTG GTC AAT ACA CAA AGC TTA TCT GAT GAA GAT AAG AAG CTG 2160
705 Ala Thr Val Val Asn Thr Gln Ser Leu Ser Asp Glu Asp Lys Lys Leu 720

2161 GCA ATC ACT ACT CTG ACT AAT ACG TTG AAT CAG GTA CAG AAA ACT CAA 2208
721 Ala Ile Thr Thr Leu Thr Asn Thr Leu Asn Gln Val Gln Lys Thr Gln 736

2209 CAG GGC GTG GCC GTC AGT CTG TTG GCG CAG ACT CTG AAC GTG AGT CAG 2256
737 Gln Gly Val Ala Val Ser Leu Leu Ala Gln Thr Leu Asn Val Ser Gln 752

2257 TCA CTG CCT GCG TTA TTG TTG CGC TGG AGT GGA CAA ACA ACC TAC CAG 2304
753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gln Thr Thr Tyr Gln 768

2305 TGG TTG AGT GCG ACT TGG GCA TTG AAG GAT GCC GTT AAG ACT GCC GCC 2352
769 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784

2353 GAT ATT CCC GCT GAC TAT CTG CGT CAA TTA CGT GAA GTG GTA CGC CGC 2400
785 Asp Ile Pro Ala Asp Tyr Leu Arg Gln Leu Arg Glu Val Val Arg Arg 800

2401 TCC TTG TTG ACC CAA CAA TTC ACG CTG AGT CCT GCA ATG GTG CAA ACC 2448
801 Ser Leu Leu Thr Gln Gln Phe Thr Leu Ser Pro Ala Met Val Gln Thr 816

2449 TTG CTG GAC TAT CCA GCC TAT TTT GGC GCT TCC GCA GAA ACA GTG ACC 2496
817 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 832

2497 GAT ATC ACT TTG TGG ATG CTT TAT ACC CTG AGC TGT TAT AGC GAT TTA 2544
833 Asp Ile Ser Leu Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848

2545 TTG CTC CAA ATG GGT GAA GCT GGT GGT ACC GAA GAT GAT GTA CTG GCC 2592
849 Leu Leu Gln Met Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 864

2593 TAC TTA CGC ACA GCT AAT GCT ACC ACA CCG TTG AGC CAA TCT GAT GCT 2640
865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gln Ser Asp Ala 880

2641 GCA CAG ACG TTG GCA ACG CTA TTG GGT TGG GAG GTT AAC GAG TTG CAA 2688
881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896

2689 GCC GCT TGG TCG GTA TTG GGC GGG ATT GCC AAA ACC ACA CCG CAA CTG 2736
897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912

2737 GAT GCG CTT CTG CGT TTG CAA CAG GCA CAG AAC CAA ACT GGT CTT GGC 2784
913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928

2785 GTT ACA CAG CAA CAG CAA GGC TAT CTC CTG ACT CGT GAC AGT GAT TAT 2832
929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944

2833 ACC CTT TGG CAA AGC ACC GGT CAG GCG CTG GTG GCT GGC GTA TCC CAT 2880
945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960

2881 GTC AAG GGC AGT AAC TGA 2898
961 Val Lys Gly Ser Asn End 966

(2) INFORMATION FOR SEQ ID NO: 57

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 965 amino acids

(B) TYPE: amino acid

(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 (TccA peptide)

Features From To Description

1 10 SEQ ID NO: 8

1 Met Asn Gln Leu Ala Ser Pro Leu Ile Ser Arg Thr Glu Glu Ile His 16
17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32
33 Val Val Arg Met Pro Arg Glu Arg Phe Ile Arg Glu His Arg Ala Asp 48
49 Leu Gly Arg Ser Ala Glu Lys Met Tyr Asp Leu Ala Val Gly Tyr Ala 64
65 His Gln Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala Val 80
81 Gln Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 96
97 Asn Gln Phe Leu Asp Ala Asn Thr Gly Trp Lys Asp Lys Ala Pro Ser 112
113 Gly Ser Pro Glu Ala Asn Asp Ala Pro Val Ala Tyr Leu Thr His Ile 128
129 Tyr Gln Leu Ala Leu Glu Gln Glu Lys Asn Gly Ala Thr Thr Ile Met 144
145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Leu Gly Ala Leu Leu Ile Asn 160
161 Asp Lys Ala Ile Asn Glu Val Ile Pro Gln Leu Gln Leu Val Asn Glu 176
177 Ile Leu Ser Lys Ala Ile Gln Lys Lys Leu Ser Leu Thr Asp Leu Glu 192
193 Ala Val Asn Ala Arg Leu Ser Thr Thr Arg Tyr Pro Asn Asn Leu Pro 208
209 Tyr His Tyr Gly His Gln Gln Ile Gln Thr Ala Gln Ser Val Leu Gly 224
225 Thr Thr Leu Gln Asp Ile Thr Leu Pro Gln Thr Leu Asp Leu Pro Gln 240
241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256
257 Ala Leu Thr Arg Leu Gln Ile Met Ala Ser Gln Phe Ser Pro Glu Gln 272
273 Gln Lys Ile Ile Thr Glu Thr Val Gly Gln Asp Phe Tyr Gln Leu Asn 288
289 Tyr Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Met Thr Ile 304
305 Met Thr Asp Arg Thr Ser Leu Thr Val Pro Gln Val Glu Leu Met Leu 320
321 Cys Ser Thr Val Gly Gly Ser Thr Val Val Lys Ser Asp Asn Val Ser 336
337 Ser Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe Ile 352
353 His Ala Gly Lys Pro Glu Ala Ile Thr Leu Ser Arg Ser Gly Ala Glu 368
369 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Leu Asp 384
385 Arg Ile Asn Arg Thr Val Arg Leu Gln Lys Trp Leu Asn Leu Pro Tyr 400
401 Glu Asp Ile Asp Leu Leu Val Thr Ser Ala Met Asp Ala Glu Thr Gly 416
417 Asn Thr Ala Leu Ser Met Asn Asp Asn Thr Leu Arg Met Leu Gly Val 432
433 Phe Lys His Tyr Gln Ala Lys Tyr Gly Val Ser Ala Lys Gln Phe Ala 448
449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala Ile Thr Pro Ala Thr Pro 464
465 Phe Leu Asp Gln Val Phe Asn Ser Val Gly Thr Phe Asp Thr Pro Phe 480
481 Val Ile Asp Asn Gln Asp Phe Val Tyr Thr Leu Thr Thr Gly Gly Asp 496
497 Gly Ala Arg Val Lys His Ile Ser Thr Ala Leu Gly Leu Asn His Arg 512
513 Gln Phe Leu Leu Leu Ala Asp Asn Ile Ala Arg Gln Gln Gly Asn Val 528
529 Thr Gln Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 544
545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly Ile Asn Pro Glu Ser Phe 560
561 Cys Ala Leu Val Asp Arg Leu Asp Ala Gly Thr Gly Ile Val Trp Gln 576
577 Gln Leu Ala Gly Lys Pro Thr Ile Thr Val Pro Gln Lys Asp Ser Pro 592
593 Leu Ala Ala Asp Ile Leu Ser Leu Leu Gln Ala Leu Ser Ala Ile Ala 608
609 Gln Trp Gln Gln Gln His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 624
625 Leu Ser Asp Asn Pro Ile Ser Thr Ser Gln Gly Thr Asp Asp Gln Leu 640
641 Asn Phe Ile Arg Gln Val Trp Gln Asn Leu Gly Ser Thr Phe Val Gly 656
657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Val Asp Thr Asn Gly 672
673 His Ala Ile Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 688
689 Ile Asp Lys Val Gly Leu Val Thr Asp Ala Gly Ile Gln Ser Val Ile 704
705 Ala Thr Val Val Asn Thr Gln Ser Leu Ser Asp Glu Asp Lys Lys Leu 720
721 Ala Ile Thr Thr Leu Thr Asn Thr Leu Asn Gln Val Gln Lys Thr Gln 736
737 Gln Gly Val Ala Val Ser Leu Leu Ala Gln Thr Leu Asn Val Ser Gln 752
753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gln Thr Thr Tyr Gln 768
769 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 784
785 Asp Ile Pro Ala Asp Tyr Leu Arg Gln Leu Arg Glu Val Val Arg Arg 800
801 Ser Leu Leu Thr Gln Gln Phe Thr Leu Ser Pro Ala Met Val Gln Thr 816
817 Leu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 832
833 Asp Ile Ser Leu Trp Met Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 848
849 Leu Leu Gln Met Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 864
865 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gln Ser Asp Ala 880
881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896
897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912
913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928
929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944
945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960
961 Val Lys Gly Ser Asn 965

(2) INFORMATION FOR SEQ ID NO: 58 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4698 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 (tccB)

1 ATG TTA TCG ACA ATG GAA AAA CAA CTG AAT GAA TCC CAG CGT GAT GCG 48
1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16

49 TTG GTG ACT GGC TAT ATG AAT TTT GTG GCG CCG ACG TTG AAA GGC GTC 96
17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32

97 AGT GGT CAG CCG GTG ACG GTG GAA GAT TTA TAC GAA TAT TTG CTG ATT 144
33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48

145 GAC CCG GAA GTG GCT GAT GAG GTT GAG ACG AGT CGG GTA GCA CAA GCG 192
49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64

193 ATT GCC AGC ATA CAG CAA TAT ATG ACT CGT CTG CTC AAC GGC TCT GAA 240
65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80

241 CCG GGG CGT CAG GCG ATG GAG CCT TCT ACA GCT AAC GAA TGG CCT GAT 288
81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96

289 AAT GAT AAC CAA TAT GCT ATC TGG GCT GCG GGG GCT GAG GTT CGA AAT 336
97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112

337 TAC GCT GAA AAC TAT ATT TCA CCC ATC ACC CGG CAG GAA AAA AGC CAT 384
113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128

385 TAT TTC TCG GAG CTG GAG ACG ACT TTA AAT CAG AAT CGA CTC GAT CCG 432
129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144

433 GAT CGT GTG CAG GAT GCT GTT TTG GCG TAT CTC AAT GAG TTT GAG GCA 480
145 Asp Arg Val Gln Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160

481 GTG AGT AAT CTA TAT GTG CTC AGT GGT TAT ATT AAT CAG GAT AAA TTT 528
161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gln Asp Lys Phe 176

529 GAC CAA GCT ATC TAC TAC TTT ATT GGT CGC ACT ACC ACT AAA CCG TAT 576
177 Asp Gln Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192

577 CGC TAC TAC TGG CGT CAG ATG GAT TTG AGT AAG AAC CGT CAA GAT CCG 624
193 Arg Tyr Tyr Trp Arg Gln Met Asp Leu Ser Lys Asn Arg Gln Asp Pro 208

625 GCA GGG AAT CCG GTG ACG CCA AAT TGC TGG AAT GAT TGG CAG GAA ATC 672
209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gln Glu Ile 224

673 ACT TTG CCG CTG TCT GGT GAT ACG GTG CTG GAG CAT ACA GTT CGC CCG 720
225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240

721 GTA TTT TAT AAT GAT CGA CTA TAT GTG GCT TGG GTT GAG CGT GAC CCG 768
241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256

769 GCA GTA CAG AAG GAT GCT GAC GGT AAA AAC ATC GGT AAA ACC CAT GCC 816
257 Ala Val Gln Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272

817 TAC AAC ATA AAG TTT GGT TAT AAA CGT TAT GAT GAT ACT TGG ACA GCG 864
273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288

865 CCG AAT ACG ACC ACG TTA ATG ACA CAA CAA GCA GGG GAA AGT TCA GAA 912
289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gln Ala Gly Glu Ser Ser Glu 304

913 ACA CAG CGA TCC AGC CTG CTG ATT GAT GAA TCT AGC ACC ACA TTC CGC 960
305 Thr Gln Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320

961 CAA GTT AAT CTG TTG GCT ACC ACC GAT TTT AGT ATC GAT CCG ACG GAG 1008
321 Gln Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336

1009 GAA ACG GAC AGT AAC CCG TAT GGC CGC CTA ATG TTG GGG GTG TTT GTC 1056
337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352

1057 CGT CAA TTT GAA GGT GAT GGG GCC AAT AGA AAA AAT AAA CCC GTT GTT 1104
353 Arg Gln Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368

1105 TAT GGT TAT CTC TAT TGT GAC TCA GCT TTC AAT CGT CAT GTT CTC AGG 1152
369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384

1153 CCG TTA AGT AAG AAC TTT TTG TTC AGT ACT TAC CGT GAT GAA ACG GAT 1200
385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400

1201 GGT CAA AAC AGC TTG CAA TTT GCG GTA TAC GAT AAA AAG TAT GTA ATT 1248
401 Gly Gln Asn Ser Leu Gln Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416

1249 ACT AAG GTT GTT ACA GGT GCA ACG GAA GAT CCC GAA AAT ACA GGA TGG 1296
417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432

1297 GTA AGT AAA GTT GAT GAC TTG AAA CAA GGC ACT ACT GGG GCC TAT GTG 1344
433 Val Ser Lys Val Asp Asp Leu Lys Gln Gly Thr Thr Gly Ala Tyr Val 448

1345 TAT ATC GAT CAA GAT GGC CTG ACG CTT CAT ATA CAA ACC ACA ACT AAT 1392
449 Tyr Ile Asp Gln Asp Gly Leu Thr Leu His Ile Gln Thr Thr Thr Asn 464

1393 GGG GAT TTT ATT AAC CGT CAT ACG TTT GGA TAT AAC GAT CTT GTA TAT 1440
465 Gly Asp Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tyr 480

1441 GAT TCT AAG TCT GGT TAT GGT TTC ACG TGG TCA GGA AAT GAA GGT TTT 1488
481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496

1489 TAT CTG GAT TAC CAT GAT GGA AAT TAT TAC ACC TTT CAT AAT GCA ATA 1536
497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala Ile 512

1537 ATC AAC TAC TAT CCG TCT GGA TAT GGT GGT GGA TCT GTT CCT AAT GGA 1584
513 Ile Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 528

1585 ACG TGG GCG TTA GAG CAA AGG ATT AAT GAG GGA TGG GCT ATT GCT CCC 1632
529 Thr Trp Ala Leu Glu Gln Arg Ile Asn Glu Gly Trp Ala Ile Ala Pro 544

1633 CTG CTT GAT ACT CTC CAT ACT GTT ACT GTG AAG GGC AGT TAT ATC GCT 1680
545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser Tyr Ile Ala 560

1681 TGG GAA GGG GAA ACA CCT ACC GGT TAT AAT CTG TAT ATT CCA GAT GGT 1728
561 Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576

1729 ACC GTG TTG CTA GAT TGG TTT GAT AAA ATA AAT TTT GCT ATT GGT CTT 1776
577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ile Asn Phe Ala Ile Gly Leu 592

1777 AAT AAG CTT GAG TCT GTA TTT ACG TCG CCA GAT TGG CCA ACA CTA ACC 1824
593 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 608

1825 ACT ATC AAA AAT TTC AGT AAA ATC GCC GAT AAC CGC AAA TTC TAT CAG 1872
609 Thr Ile Lys Asn Phe Ser Lys Ile Ala Asp Asn Arg Lys Phe Tyr Gln 624

1873 GAA ATC AAT GCT GAG ACG GCG GAT GGA CGC AAC CTG TTT AAA CGT TAC 1920
625 Glu Ile Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 640

1921 AGT ACT CAA ACT TTC GGA CTT ACC AGC GGT GCG ACT TAT TCT ACA ACT 1968
641 Ser Thr Gln Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 656

1969 TAT ACT TTG TCT GAG GCG GAT TTC TCC ACT GAT CCG GAC AAA AAC TAC 2016
657 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672

2017 CTA CAG GTT TCT TTG AAT GTC GTG TGG GAT CAT TAT GAC CGC CCG TCA 2064
673 Leu Gln Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688

2065 GGG AAA AAA GGG GCT TAT TCT TGG GTC AGT AAG TGG TTT AAC GTC TAT 2112
689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 704

2113 GTT GCG TTG CAA GAT AGC AAA GCT CCG GAT GCC ATT CCT CGA TTA GTT 2160
705 Val Ala Leu Gln Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Leu Val 720

2161 TCC CGT TAC GAT AGT AAA CGT GGT CTG GTG CAA TAT CTG GAC TTC TGG 2208
721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gln Tyr Leu Asp Phe Trp 736

2209 ACC TCA TCA TTA CCC GCG AAA ACC CGT CTT AAC ACC ACC TTT GTG CGT 2256
737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 752

2257 ACT TTG ATT GAG AAG GCT AAT CTG GGG CTG GAT AGT TTG CTG GAT TAC 2304
753 Thr Leu Ile Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr 768

2305 ACC TTG CAG GCA GAT CCT TCT CTG GAA GCA GAT TTA GTG ACT GAC GGC 2352
769 Thr Leu Gln Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 784

2353 AAA AGC GAA CCA ATG GAC TTT AAT GGT TCA AAC GGT CTC TAT TTC TGG 2400
785 Lys Ser Glu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tyr Phe Trp 800

2401 GAA TTG TTC TTT CAC CTG CCG TTT TTG GTT GCT ACA CGC TTT GCC AAC 2448
801 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 816

2449 GAA CAG CAA TTT TCG CCG GCA CAA AAG AGT TTG CAT TAC ATC TTT GAC 2496
817 Glu Gln Gln Phe Ser Pro Ala Gln Lys Ser Leu His Tyr Ile Phe Asp 832

2497 CCG GCG ATG AAA AAC AAG CCA CAC AAT GCC CCG GCT TAT TGG AAT GTA 2544
833 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn Val 848

2545 CGT CCG TTG GTT GAA GGA AAC AGC GAT TTG TCA CGT CAT TTG GAC GAT 2592
849 Arg Pro Leu Val Glu Gly Asn Ser Asp Leu Ser Arg His Leu Asp Asp 864

2593 TCT ATA GAC CCA GAT ACT CAA GCT TAT GCT CAT CCG GTG ATA TAC CAG 2640
865 Ser Ile Asp Pro Asp Thr Gln Ala Tyr Ala His Pro Val Ile Tyr Gln 880

2641 AAA GCG GTG TTT ATT GCC TAT GTC AGT AAC CTG ATT GCT CAG GGA GAT 2688
881 Lys Ala Val Phe Ile Ala Tyr Val Ser Asn Leu Ile Ala Gln Gly Asp 896

2689 ATG TGG TAT CGC CAA TTG ACT CGT GAC GGT CTG ACT CAG GCC CGT GTC 2736
897 Met Trp Tyr Arg Gln Leu Thr Arg Asp Gly Leu Thr Gln Ala Arg Val 912

2737 TAT TAC AAT CTG GCC GCT GAA TTG CTA GGG CCT CGT CCG GAT GTA TCG 2784
913 Tyr Tyr Asn Leu Ala Ala Glu Leu Leu Gly Pro Arg Pro Asp Val Ser 928

2785 CTG AGT AGC ATT TGG ACG CCG CAA ACC CTG GAT ACC TTA GCA GCC GGG 2832
929 Leu Ser Ser Ile Trp Thr Pro Gln Thr Leu Asp Thr Leu Ala Ala Gly 944

2833 CAA AAA GCG GTT TTA CGT GAT TTT GAG CAC CAG TTG GCT AAT AGT GAT 2880
945 Gln Lys Ala Val Leu Arg Asp Phe Glu His Gln Leu Ala Asn Ser Asp 960

2881 ACC GCT TTA CCC GCA TTG CCG GGC CGC AAT GTC AGC TAC TTG AAA CTG 2928
961 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val Ser Tyr Leu Lys Leu 976

2929 GCA GAT AAT GGC TAC TTT AAT GAA CCG CTC AAT GTT CTG ATG TTG TCT 2976
977 Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val Leu Met Leu Ser 992

2977 CAC TGG GAT ACG TTG GAT GCA CGG TTA TAC AAT CTG CGT CAT AAC CTG 3024
993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Leu 1008

3025 ACC GTT GAT CGC AAG CCG CTT TCG CTG CCG CTG TAT GCT GCG CCT GTT 3072
1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val 1024

3073 GAT CCG GTA GCG TTG TTG GCT CAG CGT GCT CAG TCC GGC ACG TTG ACG 3120
1025 Asp Pro Val Ala Leu Leu Ala Gln Arg Ala Gln Ser Gly Thr Leu Thr 1040

3121 AAT GGC GTC AGT GGC GCC ATG TTG ACG GTG CCG CCA TAC CGT TTC AGC 3168
1041 Asn Gly Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser 1056

3169 GCT ATG TTG CCG CCA GCT TAC AGC GCC GTG GGT ACG TTG ACC ACT TTT 3216
1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072

3217 GGT CAG AAC CTG CTT AGT TTG TTG GAA CGT AGC GAA CGA GCC TGT CAA 3264
1073 Gly Gln Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gln 1088

3265 GAA GAG TTG GCG CAA CAG CAA CTG TTG GAT ATG TCC AGC TAT GCC ATT 3312
1089 Glu Glu Leu Ala Gln Gln Gln Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104

3313 ACG TTG CAA CAA CAG GCG CTG GAT GGA TTG GCG GCA GAT CGT CTG GCG 3360
1105 Thr Leu Gln Gln Gln Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120

3361 CTG CTA GCT ACT CAG GCT ACG GCA CAA CAG CGT CAT GAC CAT TAT TAC 3408
1121 Leu Leu Ala Ser Gln Ala Thr Ala Gln Gln Arg His Asp His Tyr Tyr 1136

3409 ACT CTG TAT CAG AAC AAC ATC TCC AGT GCG GAA CAA CTG GTG ATG GAC 3456
1137 Thr Leu Tyr Gln Asn Asn Ile Ser Ser Ala Glu Gln Leu Val Met Asp 1152

3457 ACC CAA ACG TCA GCA CAA TCC CTG ATT TCT TCT TCC ACT GGT GTA CAA 3504
1153 Thr Gln Thr Ser Ala Gln Ser Leu Ile Ser Ser Ser Thr Gly Val Gln 1168

3505 ACT GCC AGT GGG GCA CTG AAA GTG ATC CCG AAT ATC TTT GGT TTG GCT 3552
1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184

3553 GAT GGC GGC TCG CGC TAT GAA GGA GTA ACG GAA GCG ATT GCC ATC GGG 3600
1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200

3601 TTA ATG GCT GCC GGA CAA GCC ACC AGC GTG GTG GCC GAG CGT CTG GCA 3648
1201 Leu Met Ala Ala Gly Gln Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216

3649 ACC ACG GAG AAT TAC CGC CGC CGC CGT GAA GAG TGG CAA ATC CAA TAC 3696
1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gln Ile Gln Tyr 1232

3697 CAG CAG GCA CAG TCT GAG GTC GAC GCA TTA CAG AAA CAG TTG GAT GCG 3744
1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248

3745 CTG GCA GTG CGC GAG AAA GCA GCT CAA ACT TCC CTG CAA CAG GCG AAG 3792
1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264

3793 GCA CAG CAG GTA CAA ATT CGG ACC ATG CTG ACT TAC TTA ACT ACT CGT 3840
1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280

3841 TTC ACC CAG GCG ACT CTG TAC CAG TGG CTG AGT GGT CAA TTA TCC GCG 3888
1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296

3889 TTG TAT TAT CAA GCG TAT GAT GCC GTG GTT GCT CTC TGC CTC TCC GCC 3936
1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312

3937 CAA GCT TGC TGG CAG TAT GAA TTG GGT GAT TAC GCT ACC ACT TTT ATC 3984
1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328

3985 CAG ACC GGT ACC TGG AAC GAC CAT TAC CGT GGT TTG CAA GTG GGG GAG 4032
1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344

4033 ACA CTG CAA CTC AAT TTG CAT CAG ATG GAA GCG GCC TAT TTA GTT CGT 4080
1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360

4081 CAC GAA CGC CGT CTT AAT CTG ATC CGT ACT GTG TCG CTC AAA AGC CTA 4128
1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376

4129 TTG GGT GAT GAT GGT TTT GGT AAG TTA AAA ACC GAA GGC AAA GTC GAC 4176
1377 Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392

4177 TTT CCA TTA AGC GAA AAG CTG TTT GAC AAC GAC TAT CCG GGG CAC TAT 4224
1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408

4225 TTG CGC CAG ATT AAA ACT GTG TCA GTG ACG TTG CCG ACG TTA GTC GGG 4272
1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424

4273 CCG TAT CAA AAC GTG AAG GCA ACG CTC ACT CAG ACC AGC AGC ACT ATA 4320
1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440

4321 TTG TTA GCA GCA GAT ATC AAT GGT GTT AAA CGT CTC AAT GAT CCG ACA 4368
1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456

4369 GGT AAA GAG GGT GAT GCG ACG CAT ATT GTC ACC AAT CTG CGT GCC AGC 4416
1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472

4417 CAG CAG GTG GCG CTC TCT TCT GGC ATT AAT GAT GCC GGT AGC TTT GAG 4464
1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488

4465 TTG CGT TTG GAA GAT GAG CGC TAT CTA TCA TTT GAG GGG ACT GGA GCT 4512
1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504

4513 GTT TCC AAA TGG ACT CTT AAC TTC CCG CGT TCT GTG GAT GAG CAT ATT 4560
1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520

4561 GAC GAT AAG ACA TTG AAA GCG GAT GAG ATG CAG GCC GCA CTG TTG GCG 4608
1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536

4609 AAT ATG GAT GAT GTG CTG GTG CAG GTG CAT TAT ACC GCC TGC GAC GGC 4656
1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552

4657 GGC GCC AGT TTC GCA AAC CAG GTC AAG AAA ACA CTC TCT TAA 4698
1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser End 1566

(2) INFORMATION FOR SEQ ID NO: 59 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1565 amino acids

(B) TYPE: amino acid 

(C) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59 (TccB peptide)

Features From To Description

1 11 SEQ ID NO: 7

1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16
17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32
33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48
49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64
65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80
81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96
97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112
113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128
129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144
145 Asp Arg Val Gln Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160
161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gln Asp Lys Phe 176
177 Asp Gln Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192
193 Arg Tyr Tyr Trp Arg Gln Met Asp Leu Ser Lys Asn Arg Gln Asp Pro 208
209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gln Glu Ile 224
225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240
241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256
257 Ala Val Gln Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272
273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288
289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gln Ala Gly Glu Ser Ser Glu 304
305 Thr Gln Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320
321 Gln Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336
337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352
353 Arg Gln Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368
369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384
385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400
401 Gly Gln Asn Ser Leu Gln Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416
417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432
433 Val Ser Lys Val Asp Asp Leu Lys Gln Gly Thr Thr Gly Ala Tyr Val 448
449 Tyr Ile Asp Gln Asp Gly Leu Thr Leu His Ile Gln Thr Thr Thr Asn 464
465 Gly Asp Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tyr 480
481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496
497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala Ile 512
513 Ile Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 528
529 Thr Trp Ala Leu Glu Gln Arg Ile Asn Glu Gly Trp Ala Ile Ala Pro 544
545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser Tyr Ile Ala 560
561 Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576
577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ile Asn Phe Ala Ile Gly Leu 592
593 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 608
609 Thr Ile Lys Asn Phe Ser Lys Ile Ala Asp Asn Arg Lys Phe Tyr Gln 624
625 Glu Ile Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 640
641 Ser Thr Gln Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 656
657 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672
673 Leu Gln Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688
689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 704
705 Val Ala Leu Gln Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Leu Val 720
721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gln Tyr Leu Asp Phe Trp 736
737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 752
753 


Thr 


Leu 


He 


Glu 


Lys 


Ala 


Asn 


Leu 


Gly 


Leu 


Asp 


Ser 


Leu 


Leu 


Asp 


T/r 


35 


769 


Thr 


Leu 


Gin 


Ala 


Asp 


Pro 


Ser 


Leu 


Glu 


Ala 


Asp 


Leu 


Val 


Thr 


Asp 


Gly 


735 


Lys 


Ser 


Glu 


Pro 


Mec 


Asp 


Phe 


Asn 


Gly 


Ser 


Asn 


Gly 


Leu 


Tyr 


Phe 


Trp 




801 


Glu 


Leu 


Phe 


Phe 


His 


Leu 


Pro 


Phe 


Leu 


Val 


Ala 


Thr 


Arg 


Phe 


Ala 


Asn 


40 


317 


Glu 


Gin 


Gin 


Phe 


ser 


Pro 


Ala 


Gin 


Lys 


Ser 


Leu 


His 


Tyr 


He 


Phe 


Asp 




333 


Pro 


Ala 


Met 


Lys 


Asn 


Lys 


Pro 


His 


Asn 


Ala 


Pro 


Ala 


Tyr 


Trp 


Asn 


Val 


45 


349 


Arg 


Pro 


Leu 


Val 


Glu 


Gly 


Asn 


Ser 


Asp 


Leu 


Ser 


Arg 


His 


Leu 


Asp 


Asp 


365 


Ser 


He 


Asp 


Pro 


Asp 


Thr 


Gin 


Ala 


Tyr 


Ala 


His 


Pro 


Val 


He 


T/r 


Gin 




881 


Lys 


Ala 


Val 


Phe 


He 


Ala 


Tyr 


Val 


Ser 


Asn 


Leu 


He 


Ala 


Gin 


Gly 


Asp 


50 


357 


Mec 


Trp 


Tyr 


Arg 


Gin 


Leu 


Thr 


Arg 


Asp 


Gly 


Leu 


Thr 


Gin 


Ala 


Arg 


Val 




913 


Tyr 


Tyr 


Asn 


Leu 


Ala 


Ala 


Glu 


Leu 


Leu 


Gly 


Pro 


Arg 


Pro 


Asp 


Val 


Ser 


55 


929 


Leu 


Ser 


Ser 


He 


Trp 


Thr 


Pro 


Gin 


Thr 


Leu 


Asp 


Thr 


Leu 


Ala 


Ala 


Gly 


945 


Gin 


Lys 


Ala 


Val 


Leu 


Arg 


Asp 


Phe 


Glu 


His 


Gin 


Leu 


Ala 


Asn 


Ser 


Asp 




961 


Thr 


Ala 


Leu 


Pro 


Ala 


Leu 


pro 


Gly 


Arg 


Asn 


Val 


Ser 


Tyr 


Leu 


Lys 


Leu 


60 


977 


Ala 


Asp 


Asn 


Gly 


Tyr 


Phe 


Asn 


Glu 


Pro 


Leu 


Asn 


Val 


Leu 


Met 


Leu 


Ser 




993 


His 


Trp 


Asp 


Thr 


Leu 


Asp 


Ala 


Arg 


Leu 


Tyr 


Asn 


Leu 


Arg 


His 


Asn 


Leu 


65 


1009 


Thr 


Val 


Asp 


Gly 


Lys 


Pro 


Leu 


Ser 


Leu 


Pro 


Leu 


Tyr 


Ala 


Ala 


Pro 


7a 1 


'25 


Asp 


Pro 


Val 


Ala 


Leu 


Leu 


Ala 


Gin 


Arg 


Ala 


Gin 


Ser 


Gly 


Thr 


Leu 


Thr 



544 

560 
576 
592 
603 
624 
640 
655 
6~2 
638 
704 
720 
736 
752 
763 
734 
300 
316 
332 
343 
364 
380 
396 
312 
523 
944 
9 60 
J" 7 6 
552 
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1041 


Asn 


Gly 


Val 


Ser 


Gly 


Ala 


Met 


Leu 


Thr 


Val 


Pro 


Pro 


Tyr 


Arg 


Phe 


Ser 


10 56 


5 


105" 


Ala 


Met 


Leu 


Pro 


Arg 


Ala 


Tyr 


Ser 


Ala 


Val 


Gly 


Thr 


Leu 


Thr 


Ser 


Phe 


10" 2 


1073 


Gly 


Gin 


Asn 


Leu 


Leu 


Ser 


Leu 


Leu 


Glu 


Arg 


Ser 


Glu 


Arg 


Ala 


Cys 


Gin 


1033 




1039 


Glu 


Glu 


Leu 


Ala 


Gin 


Gin 


Gin 


Leu 


Leu 


Asp 


Met 


Ser 


Ser 


Tyr 


Ala 


He 


1104 


10 


1105 


Thr 


Leu 


Gin 


Gin 


Gin 


Ala 


Leu 


Asp 


Gly 


Leu 


Ala 


Ala 


Asp 


Arg 


Leu 


Ala 


1120 




1121 


Leu 


Leu 


Ala 


Ser 


Gin 


Ala 


Thr 


Ala 


Gin 


Gin 


Arg 


His 


Asp 


His 


Tyr 


Tyr 


1136 


15 


1137 


Thr 


Leu 


Tyr 


Gin 


Asn 


Asn 


lie 


Ser 


Ser 


Ala 


Glu 


Gin 


Leu 


Val 


Met 


Asp 


1152 


1153 


Thr 


Gin 


Thr 


Ser 


Ala 


Gin 


Ser 


Leu 


He 


Ser 


Ser 


Ser 


Thr 


Gly 


Val 


Gin 


1163 




1169 


Thr 


Ala 


Ser 


Gly 


Ala 


Leu 


Lys 


Val 


He 


Pro 


Asn 


He 


Phe 


Gly 


Leu 


Ala 


1134 


20 


1185 


Asp 


Gly 


Gly 


Ser 


Arg 


Tyr 


Glu 


Gly 


Val 


Thr 


Glu 


Ala 


He 


Ala 


He 


Gly 


12 00 




1201 


Leu 


Met 


Ala 


Ala 


Gly 


Gin 


Ala 


Thr 


Ser 


Val 


Val 


Ala 


Glu 


Arg 


Leu 


Ala 


1216 


25 


1217 
1233 


Thr 
Gin 


Thr 
Gin 


Glu 
Ala 


Asn 
Gin 


Tyr 
Ser 


Arg 
Glu 


Arg 
Val 


Arg 
Asp 


Arg 
Ala 


Glu 
Leu 


Glu 
Gin 


Trp 
Lys 


Gin 
Gin 


He 
Leu 


Gin 
Asp 


Tyr 
Ala 


12 3 2 
1243 




1249 


Leu 


Ala 


Val 


Arg 


Glu 


Lys 


Ala 


Ala 


Gin 


Thr 


Ser 


Leu 


Gin 


Gin 


Ala 


Lys 


12 64 


30 


1265 


Ala 


Gin 


Gin 


Val 


Gin 


lie 


Arg 


Thr 


Met 


Leu 


Thr 
(2) INFORMATION FOR SEQ ID NO: 60

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3132 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 (tccC)

1 ATG AGT CCG TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA ACA CTC AGC
1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser

49 GTG TTA GAT AAT CGC GGT CTG TCC ATT CGT GAT ATT GGT TTT CAC CGT
17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg

97 ATT GTA ATC GGG GGG GAT ACT GAC ACC CGC GTC ACC CGT CAC CAG TAT
33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr

145 GAT GCC CGT GGA CAC CTG AAC TAC AGT ATT GAC CCA CGC TTG TAT GAT
49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp

193 GCA AAG CAG GCT GAT AAC TCA GTA AAG CCT AAT TTT GTC TGG CAG CAT
65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His

241 GAT CTG GCC GGT CAT GCC CTG CGG ACA GAG AGT GTC GAT GCT GGT CGT
81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg

289 ACT CTT GCA TTG AAT GAT ATT GAA GGT CGT TCG GTA ATG ACA ATG AAT
97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn

337 GCG ACC GGT GTT CGT CAG ACC CGT CGC TAT GAA GGC AAC ACC TTG CCC
113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro

385 GCT CGC TTG TTA TCT GTG AGC GAG CAA GTT TTC AAC CAA GAG AGT GCT
129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala

433 AAA GTG ACA GAG CGC TTT ATC TGG GCT GGG AAT ACA ACC TCG GAG AAA
145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys

481 GAG TAT AAC CTC TCC GGT CTG TGT ATA CGC CAC TAC GAC ACA GCG GGA
161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly

529 GTG ACC CGG TTG ATG AGT CAG TCA CTG GCG GGC GCC ATG CTA TCC CAA
177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln

577 TCT CAC CAA TTG CTG GCG GAA GGG CAG GAG GCT AAC TGG AGC GGT GAC
193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp
625 GAC GAA ACT GTC TGG CAG GGA ATG CTG GCA AGT GAG GTC TAT ACG ACA
209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr

673 CAA AGT ACC ACT AAT GCC ATC GGG GCT TTA CTG ACC CAA ACC GAT GCG 720
225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240

721 AAA GGC AAT ATT CAG CGT CTG GCT TAT GAC ATT GCC GGT CAG TTA AAA 768
241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256

769 GGG AGT TGG TTG ACG GTG AAA GGC CAG AGT GAA CAG GTG ATT GTT AAG 816
257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272

817 TCC CTG AGC TGG TCA GCC GCA GGT CAT AAA TTG CGT GAA GAG CAC GGT 864
273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288

865 AAC GGC GTG GTT ACG GAG TAC AGT TAT GAG CCG GAA ACT CAA CGT CTG 912
289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304

913 ATA GGT ATC ACC ACC CGG CGT GCC GAA GGG AGT CAA TCA GGA GCC AGA 960
305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320

961 GTA TTG CAG GAT CTA CGC TAT AAG TAT GAT CCG GTG GGG AAT GTT ATC 1008
321 Val Leu Gln Asp Leu Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val Ile 336

1009 AGT ATC CAT AAT GAT GCC GAA GCT ACC CGC TTT TGG CGT AAT CAG AAA 1056
337 Ser Ile His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gln Lys 352

1057 GTG GAG CCG GAG AAT CGC TAT GTT TAT GAT TCT CTG TAT CAG CTT ATC 1104
353 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Leu Tyr Gln Leu Met 368

1105 AGT GCG ACA GGG CGT GAA ATG GCT AAT ATC GGT CAG CAA AGC AAC CAA 1152
369 Ser Ala Thr Gly Arg Glu Met Ala Asn Ile Gly Gln Gln Ser Asn Gln 384

1153 CTT CCC TCA CCC GTT ATA CCT GTT CCT ACT GAC GAC AGC ACT TAT ACC 1200
385 Leu Pro Ser Pro Val Ile Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400

1201 AAT TAC CTT CGT ACC TAT ACT TAT GAC CGT GGC GGT AAT TTG GTT CAA 1248
401 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gln 416

1249 ATC CGA CAC AGT TCA CCC GCG ACT CAA AAT AGT TAC ACC ACA GAT ATC 1296
417 Ile Arg His Ser Ser Pro Ala Thr Gln Asn Ser Tyr Thr Thr Asp Ile 432

1297 ACC GTT TCA AGC CGC AGT AAC CGG GCG GTA TTG AGT ACA TTA ACG ACA 1344
433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 448

1345 GAT CCA ACC CGA GTG GAT GCG CTA TTT GAT TCC GGC GGT CAT CAG AAG 1392
449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gln Lys 464

1393 ATG TTA ATA CCG GGG CAA AAT CTG GAT TGG AAT ATT CGG GGT GAA TTG 1440
465 Met Leu Ile Pro Gly Gln Asn Leu Asp Trp Asn Ile Arg Gly Glu Leu 480

1441 CAA CGA GTC ACA CCG GTG AGC CGT GAA AAT AGC AGT GAC AGT GAA TGG 1488
481 Gln Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser Asp Ser Glu Trp 496

1489 TAT CGC TAT AGC AGT GAT GGC ATG CGG CTG CTA AAA GTG AGT GAA CAG 1536
497 Tyr Arg Tyr Ser Ser Asp Gly Met Arg Leu Leu Lys Val Ser Glu Gln 512

1537 CAG ACG GGC AAC AGT ACT CAA GTA CAA CGG GTG ACT TAT CTG CCG GGA 1584
513 Gln Thr Gly Asn Ser Thr Gln Val Gln Arg Val Thr Tyr Leu Pro Gly 528

1585 TTA GAG CTA CGG ACA ACT GGG GTT GCA GAT AAA ACA ACC GAA GAT TTG 1632
529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Glu Asp Leu 544

1633 CAG GTG ATT ACG GTA GGT GAA GCG GGT CGC GCA CAG GTA AGG GTA TTG 1680
545 Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu 560

1681 CAC TGG GAA AGT GGT AAG CCG ACA GAT ATT GAC AAC AAT CAG GTG CGC 1728
561 His Trp Glu Ser Gly Lys Pro Thr Asp Ile Asp Asn Asn Gln Val Arg 576

1729 TAC AGC TAC GAT AAT CTG CTT GGC TCC AGC CAG CTT GAA CTG GAT AGC 1776
577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gln Leu Glu Leu Asp Ser 592

1777 GAA GCG CAG ATT CTC AGT CAG GAA GAG TAT TAT CCG TAT GGC GGT ACG 1824
593 Glu Gly Gln Ile Leu Ser Gln Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr 608

1825 GCG ATA TGG GCG GCG AGA AAT CAG ACA GAA GCC AGC TAC AAA TTT ATT 1872
609 Ala Ile Trp Ala Ala Arg Asn Gln Thr Glu Ala Ser Tyr Lys Phe Ile 624

1873 CGT TAC TCC GGT AAA GAG CGG GAT GCC ACT GGA TTG TAT TAT TAC GGC 1920
625 Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly 640

1921 TAC CGT TAT TAT CAA CCT TGG GTG GGT CGA TGG TTG AGT GCT GAT CCG 1968
641 Tyr Arg Tyr Tyr Gln Pro Trp Val Gly Arg Trp Leu Ser Ala Asp Pro 656

1969 GCG GGA ACC GTG GAT GGG CTG AAT TTG TAC CGA ATG GTG AGG AAT AAC 2016
657 Ala Gly Thr Val Asp Gly Leu Asn Leu Tyr Arg Met Val Arg Asn Asn 672

2017 CCC ATC ACA TTG ACT GAC CAT GAC GGA TTA GCA CCG TCT CCA AAT AGA 2064
673 Pro Ile Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 688

2065 AAT CGA AAT ACA TTT TGG TTT GCT TCA TTT TTG TTT CGT AAA CCT GAT 2112
689 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro Asp 704

2113 GAG GGA ATG TCC GCG TCA ATG AGA CGG GGA CAA AAA ATT GGC AGA GCC 2160
705 Glu Gly Met Ser Ala Ser Met Arg Arg Gly Gln Lys Ile Gly Arg Ala 720

2161 ATT GCC GGC GGG ATT GCG ATT GGC GGT CTT GCG GCT ACC ATT GCC GCT 2208
721 Ile Ala Gly Gly Ile Ala Ile Gly Gly Leu Ala Ala Thr Ile Ala Ala
2209 ACG GCT GGC GCG GCT ATC CCC GTC ATT CTC GGG GTT GCG GCC GTA GGC
737 Thr Ala Gly Ala Ala Ile Pro Val Ile Leu Gly Val Ala Ala Val Gly

2257 GCG GGG ATT GGC GCG TTG ATG GGA TAT AAC GTC GGT AGC CTG CTG GAA
753 Ala Gly Ile Gly Ala Leu Met Gly Tyr Asn Val Gly Ser Leu Leu Glu

2305 AAA GGC GGG GCA TTA CTT GCT CGA CTC GTA CAG GGG AAA TCG ACG TTA
769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gln Gly Lys Ser Thr Leu

2353 GTA CAG TCG GCG GCT GGC GCG GCT GCC GGA GCG AGT TCA GCC GCG GCT
785 Val Gln Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala

2401 TAT GGC GCA CGG GCA CAA GGT GTC GGT GTT GCA TCA GCC GCC GGG GCG
801 Tyr Gly Ala Arg Ala Gln Gly Val Gly Val Ala Ser Ala Ala Gly Ala

2449 GTA ACA GGG GCT GTG GGA TCA TGG ATA AAT AAT GCT GAT CGG GGG ATT
817 Val Thr Gly Ala Val Gly Ser Trp Ile Asn Asn Ala Asp Arg Gly Ile

2497 GGC GGC GCT ATT GGG GCC GGG AGT GCG GTA GGC ACC ATT CAT ACT ATG
833 Gly Gly Ala Ile Gly Ala Gly Ser Ala Val Gly Thr Ile Asp Thr Met

2545 TTA GGG ACT GCC TCT ACC CTT ACC CAT GAA GTC GGG GCA GCG GCG GGT
849 Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly

2593 GGG GCG GCG GGT GGG ATG ATC ACC GGT ACG CAA GGG AGT ACT CGG GCA
865 Gly Ala Ala Gly Gly Met Ile Thr Gly Thr Gln Gly Ser Thr Arg Ala

2641 GGT ATC CAT GCC GGT ATT GGC ACC TAT TAT GGC TCC TGG ATT GGT TTT
881 Gly Ile His Ala Gly Ile Gly Thr Tyr Tyr Gly Ser Trp Ile Gly Phe

2689 GGT TTA GAT GTC GCT AGT AAC CCC GCC GGA CAT TTA GCG AAT TAC GCA
897 Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala

2737 GTG GGT TAT GCC GCT GGT TTG GGT GCT GAA ATG GCT GTC AAC AGA ATA
913 Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Met Ala Val Asn Arg Ile

2785 ATG GGT GGT GGA TTT TTG AGT AGG CTC TTA GGC CGG GTT GTC AGC CCA
929 Met Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro

2833 TAT GCC GCC GGT TTA GCC AGA CAA TTA GTA CAT TTC AGT GTC GCC AGA
945 Tyr Ala Ala Gly Leu Ala Arg Gln Leu Val His Phe Ser Val Ala Arg

2881 CCT GTC TTT GAG CCG ATA TTT AGT GTT CTC GGC GGG CTT GTC GGT GGT
961 Pro Val Phe Glu Pro Ile Phe Ser Val Leu Gly Gly Leu Val Gly Gly

2929 ATT GGA ACT GGC CTG CAC AGA GTG ATG GGA AGA GAG AGT TGG ATT TCC
977 Ile Gly Thr Gly Leu His Arg Val Met Gly Arg Glu Ser Trp Ile Ser



2977 AGA GCG TTA AGT GCT GCC GGT AGT GGT ATA GAT CAT GTC GCT GGC ATG
993 Arg Ala Leu Ser Ala Ala Gly Ser Gly Ile Asp His Val Ala Gly Met

3025 ATT GGT AAT CAG ATC AGA GGC AGG GTC TTG ACC ACA ACC GGG ATC GCT 3072
1009 Ile Gly Asn Gln Ile Arg Gly Arg Val Leu Thr Thr Thr Gly Ile Ala 1024

3073 AAT GCG ATA GAC TAT GGC ACC AGT GCT GTG GGA GCC GCA CGA CGA GTT 3120
1025 Asn Ala Ile Asp Tyr Gly Thr Ser Ala Val Gly Ala Ala Arg Arg Val 1040

3121 TTT TCT TTG TAA 3132
1041 Phe Ser Leu End 1043
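
As an illustrative cross-check of the listing above (not part of the original disclosure), the stated length of SEQ ID NO: 60 and its final row can be verified with a short sketch. It assumes the standard genetic code; the codon table is deliberately restricted to the codons of the final row, and the function names translate and coding_length_bp are hypothetical.

```python
# Sketch only: a partial codon table (standard genetic code assumed),
# covering just the four codons in the last row of SEQ ID NO: 60.
CODON_TABLE = {
    "TTT": "Phe",
    "TCT": "Ser",
    "TTG": "Leu",
    "TAA": "End",  # stop codon, written "End" in the listing
}


def translate(dna_row):
    """Translate a space-separated row of codons into three-letter residues."""
    return [CODON_TABLE[codon] for codon in dna_row.split()]


def coding_length_bp(n_residues):
    """Base pairs needed to encode n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


if __name__ == "__main__":
    # Final row of the listing: nucleotides 3121-3132, residues 1041-1043.
    print(translate("TTT TCT TTG TAA"))
    # 1043 residues plus a stop codon should account for the stated length.
    print(coding_length_bp(1043))
```

Under these assumptions, the 1043 residues of SEQ ID NO: 61 plus one stop codon account for the 3132 base pairs stated for SEQ ID NO: 60.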

(2) INFORMATION FOR SEQ ID NO: 61

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1043 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61 (TccC peptide)

1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16

17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32

33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48

49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64

65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80

81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96

97 Thr Val Ala Leu Asn Asp Ile Glu Gly Arg Ser Val Met Thr Met Asn 112

113 Ala Thr Gly Val Arg Gln Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 128

129 Gly Arg Leu Leu Ser Val Ser Glu Gln Val Phe Asn Gln Glu Ser Ala 144

145 Lys Val Thr Glu Arg Phe Ile Trp Ala Gly Asn Thr Thr Ser Glu Lys 160

161 Glu Tyr Asn Leu Ser Gly Leu Cys Ile Arg His Tyr Asp Thr Ala Gly 176

177 Val Thr Arg Leu Met Ser Gln Ser Leu Ala Gly Ala Met Leu Ser Gln 192

193 Ser His Gln Leu Leu Ala Glu Gly Gln Glu Ala Asn Trp Ser Gly Asp 208

209 Asp Glu Thr Val Trp Gln Gly Met Leu Ala Ser Glu Val Tyr Thr Thr 224

225 Gln Ser Thr Thr Asn Ala Ile Gly Ala Leu Leu Thr Gln Thr Asp Ala 240

241 Lys Gly Asn Ile Gln Arg Leu Ala Tyr Asp Ile Ala Gly Gln Leu Lys 256

257 Gly Ser Trp Leu Thr Val Lys Gly Gln Ser Glu Gln Val Ile Val Lys 272

273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 288

289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gln Arg Leu 304

305 Ile Gly Ile Thr Thr Arg Arg Ala Glu Gly Ser Gln Ser Gly Ala Arg 320
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We claim: 

1. A composition, comprising an effective amount of a
Photorhabdus protein toxin that has functional activity against
an insect.

2. The composition of Claim 1, wherein the Photorhabdus
toxin is produced by a purified culture of Photorhabdus, a
transgenic plant, Baculovirus, or heterologous microbial host.

3. The composition of Claim 2, wherein the Photorhabdus
toxin is produced by a purified culture of Photorhabdus luminescens.

4. The composition of Claim 2, wherein the toxin is
produced from a purified culture of Photorhabdus luminescens
strain designated ATCC 55397.

5. The composition of Claim 2, wherein the toxin is
produced by a purified culture of Photorhabdus luminescens strain
designated W-14.

6. The composition of Claim 1, wherein the toxin is
produced by a purified culture of Photorhabdus strain designated
WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9, WX-10,
WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30, WIR,
ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, or ATCC# 43952.

7. The composition of Claim 2, wherein the toxin is
produced from a purified culture of Photorhabdus luminescens
strain designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8,
WX-9, WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1,
W30, WIR, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, or
ATCC# 43952.

8. The composition of Claim 1, wherein the toxin is
represented by the amino acid sequence of SEQ ID NO:12.

9. The composition of Claim 6, wherein the composition is a
mixture of one or more toxins produced from purified cultures of
Photorhabdus.
10. The composition of Claim 1 or 6, wherein the insect is
of the order Lepidoptera, Coleoptera, Hymenoptera, Diptera,
Dictyoptera, Acarina or Homoptera.

11. The composition of Claim 1 or 6, wherein the insect
species is from order Coleoptera and is Southern Corn Rootworm,
Western Corn Rootworm, Colorado Potato Beetle, Mealworm, Boll
Weevil or Turf Grub.

12. The composition of Claim 1 or 6, wherein the insect
species is from order Lepidoptera and is Beet Armyworm, Black
Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European
Corn Borer, Tobacco Hornworm, or Tobacco Budworm.

13. The composition of Claim 1 or 6, wherein the toxin is
formulated as a sprayable insecticide.

14. The composition of Claim 1 or Claim 6, wherein the
toxin is formulated as a bait matrix and delivered in an above
ground or below ground bait station.

15. A method of controlling an insect, comprising orally
delivering to an insect an effective amount of a protein toxin
that has functional activity against an insect, wherein the
protein is produced by a purified bacterial culture of the genus
Photorhabdus.

16. The method of Claim 15, wherein the bacterium is a
purified culture of Photorhabdus luminescens.

17. The method of Claim 15, wherein the toxin is produced
from a purified culture of Photorhabdus luminescens strain
designated ATCC 55397.

18. The method of Claim 16, wherein the toxin is produced
from a purified culture of Photorhabdus luminescens strain
designated W-14.
19. The method of Claim 15, wherein the toxin is produced
from a purified culture of Photorhabdus strains designated WX-1,
WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9, WX-10, WX-11,
WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30, WIR, ATCC# 43948,
ATCC# 43949, ATCC# 43950, ATCC# 43951, or ATCC# 43952.

20. The method of Claim 15, wherein the toxin is produced
from a purified culture of Photorhabdus luminescens strains
designated WX-1, WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, WX-8, WX-9,
WX-10, WX-11, WX-12, WX-14, WX-15, H9, Hb, Hm, HP88, NC-1, W30,
WIR, ATCC# 43948, ATCC# 43949, ATCC# 43950, ATCC# 43951, or
ATCC# 43952.

21. The method of Claim 19, wherein a mixture of one or
more toxins is produced from a purified culture of Photorhabdus
and said toxins are orally delivered to an insect.

22. The method of Claim 15, wherein the toxin is produced
by a prokaryotic host transformed with a gene encoding the toxin.

23. The method of Claim 15, wherein the toxin is produced
by a eukaryotic host transformed with a gene encoding the toxin.

24. The method of Claim 23, wherein the eukaryotic host is
baculovirus.

25. The method of Claim 15 or 19, wherein the insect is of
the order Lepidoptera, Coleoptera, Hymenoptera, Diptera,
Dictyoptera, Acarina or Homoptera.

26. The method of Claim 15 or 19, wherein the insect
species is from order Coleoptera and is Southern Corn Rootworm,
Western Corn Rootworm, Colorado Potato Beetle, Mealworm, Boll
Weevil or Turf Grub.

27. The method of Claim 15 or 19, wherein the insect
species is from order Lepidoptera and is Beet Armyworm, Black
Cutworm, Cabbage Looper, Codling Moth, Corn Earworm, European
Corn Borer, Tobacco Hornworm, or Tobacco Budworm.
28. The method of Claim 15 or 19, wherein the toxin is
formulated as a sprayable insecticide.

29. The method of Claim 15 or Claim 19, wherein the toxin
is formulated as a bait matrix and delivered in an above ground
or below ground bait station.

30. A method of isolating a gene coding for a protein
subunit, comprising the steps of: constructing at least one RNA
or DNA oligonucleotide molecule that corresponds to at least a
part of a DNA coding region of an amino acid sequence selected
from a group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8,
SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID
NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41,
SEQ ID NO:42, and SEQ ID NO:43, wherein the nucleotide molecule
is used to isolate genetic material from Photorhabdus or
Photorhabdus luminescens.

31. A method for expressing a protein produced by a
purified bacterial culture of the genus Photorhabdus in a
prokaryotic or eukaryotic host in an effective amount so that the
protein has functional activity against an insect, wherein the
method comprises: constructing a chimeric DNA construct having
5' to 3' a promoter, a DNA sequence encoding a protein, a
transcription terminator, and then transferring the chimeric DNA
construct into the host.

32. The method of Claim 31, wherein the protein has
functional activity against insects selected from a group
consisting of Coleoptera, Lepidoptera, Diptera, Homoptera,
Hymenoptera, Dictyoptera, and Acarina.

33. The method of Claim 31, wherein the protein encoded by
the DNA sequence has an N-terminal amino acid sequence selected
from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID
NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ
ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID
NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41,
SEQ ID NO:42, and SEQ ID NO:43.

34. The method of Claim 31, wherein the protein encoded by
the DNA sequence includes the amino acid sequence selected from
the group consisting of SEQ ID NO:12, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35, SEQ ID
NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55,
SEQ ID NO:57, SEQ ID NO:59 and SEQ ID NO:61.



35. A chimeric DNA construct, adapted for expression in a
prokaryotic or eukaryotic host comprising, 5' to 3' a
transcriptional promoter active in the host; a DNA sequence
encoding a Photorhabdus protein that has functional activity
against an insect; and a transcriptional terminator.

36. A chimeric DNA construct of Claim 35, wherein the
protein encoded by the DNA sequence has an N-terminal amino acid
sequence selected from the group consisting of SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ
ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:13,
SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID
NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:38, SEQ ID NO:39, SEQ ID
NO:40, SEQ ID NO:41, SEQ ID NO:42, and SEQ ID NO:43.

37. The chimeric DNA construct of Claim 35, wherein the
protein encoded by the DNA sequence has an amino acid sequence
selected from the group consisting of SEQ ID NO:12, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:35, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53,
SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.

38. The chimeric DNA construct of Claim 35, wherein the DNA
sequence encoding the Photorhabdus luminescens protein is
selected from the group comprising SEQ ID NO:11, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58, and SEQ ID NO:60.

39. The chimeric DNA construct of Claim 35, wherein the
host is baculovirus.

40. An isolated and substantially purified preparation
comprising, a DNA molecule capable of encoding an effective
amount of a protein that is produced by a bacterium of the genus
Photorhabdus and that has functional activity against an insect.

41. The preparation of Claim 40, wherein the bacterium is
Photorhabdus luminescens.

42. A purified preparation comprising, a protein produced
by Photorhabdus or Photorhabdus luminescens having an N-terminal
amino acid sequence selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID
NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID
NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17,
SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID
NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:38, SEQ ID NO:39,
SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, and SEQ ID NO:43.

43. A purified protein preparation comprising, a protein
that has an N-terminal amino acid sequence selected from the
group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID
NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID
NO:9, and SEQ ID NO:10, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15,
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID
NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID
NO:42, and SEQ ID NO:43.

44. A purified protein preparation comprising, a protein
selected from the group of SEQ ID NO:12, SEQ ID NO:26, SEQ ID
NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:35,
SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, and SEQ ID NO:61.
45. A purified DNA preparation comprising, a DNA sequence
selected from the group consisting of SEQ ID NO:11, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58 and SEQ ID NO:60, wherein the DNA
sequence is isolated from its native host.

46. A purified protein preparation comprising, a
Photorhabdus luminescens protein with at least one subunit having
an approximate molecular weight between 18 kDa to about 230 kDa;
between about 160 kDa to about 230 kDa; 100 kDa to 160 kDa; about
80 kDa to about 100 kDa; or about 50 kDa to about 80 kDa.

47. A purified protein preparation comprising, a
Photorhabdus luminescens protein with at least one subunit having
an approximate molecular weight of about 280 kDa.

48. A substantially pure microorganism culture comprising,
ATCC 55397.

49. The culture of Claim 48, wherein the culture is a
derivative of ATCC 55397 that produces a protein toxin that has
functional activity against an insect.

50. A substantially pure microorganism culture comprising,
H9.

51. A substantially pure microorganism culture comprising,
Hb.

52. A substantially pure microorganism culture comprising,
Hm.

53. A substantially pure microorganism culture comprising,
HP88.

54. A substantially pure microorganism culture comprising,
NC-1.

55. A substantially pure microorganism culture comprising,
56. A substantially pure microorganism culture comprising,
WIR.

57. A transgenic plant comprising in its genome, a chimeric
artificial gene construction imbuing the plant with an ability to
express an effective amount of a Photorhabdus protein that has
functional activity against an insect.

58. The transgenic plant of Claim 57, wherein the plant is
transformed using acceleration of genetic material coated onto
microparticles directly into cells, Agrobacteria, whiskers, or
electroporation techniques.

59. The transgenic plant of Claim 57, wherein the
selectable marker is selected from the group consisting of
kanamycin, neomycin, glyphosate, hygromycin, methotrexate,
phosphinothricin (bialophos), chlorosulfuron, bromoxynil, dalapon
and the like.

60. The transgenic plant of Claim 57, wherein the promoter
is selected from the group consisting of octopine synthase,
nopaline synthase, mannopine synthase, 35S, 19S, ribulose-1,6-
bisphosphate (RUBP) carboxylase small subunit (ssu), beta-
conglycinin, phaseolin, alcohol dehydrogenase (ADH), heat-shock,
ubiquitin, zein, oleosin, napin, or acyl carrier protein (ACP).

61. The transgenic plant of Claim 57, wherein embryogenic
tissue, callus tissue type I or II, hypocotyl, meristem, or plant
tissue during dedifferentiation is used in preparing the
transgenic plant.

62. The transgenic plant of Claim 57, wherein the chimeric
gene is a DNA sequence which encodes a Photorhabdus protein that
has functional activity against an insect and at least one codon
of the gene has been modified so that the codon is a plant
preferred codon.
63. A method of controlling an insect comprising orally
delivering to an insect an effective amount of a protein toxin,
wherein the protein is produced by a transgenic plant, on which said
insect feeds.

64. A composition of matter, comprising a purified DNA
sequence from a purified bacterial culture from the genus
Photorhabdus.
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